DESCRIPTION

MOSQUITO OLFACTORY GENE, POLYPEPTIDES, AND 
METHODS OF USE THEREOF 



GOVERNMENT SUPPORT CLAUSE 

This invention was made with federal grant money under NIH grant
1 R01 DC04692-01 and NSF grant 0075338. The United States Government
has certain rights in this invention. 

A portion of the disclosure of this patent document contains material 
which is subject to copyright protection. The copyright owner has no
objection to the facsimile reproduction by anyone of the patent document or
the patent disclosure, as it appears in the Patent and Trademark Office 
patent file or records, but otherwise reserves all copyright rights 
whatsoever. 

TECHNICAL FIELD

The present invention relates generally to the field of host 
identification by insects. Specifically, the present invention relates to the 
identification and cloning of genes related to mosquito olfaction, 
identification and purification of polypeptides thereof, and methods of use 
thereof.

BACKGROUND ART 

The ability of an insect to respond to chemical stimuli is necessary for 
the insect to reproduce, mate, and feed. For example, insects respond to 
certain chemical stimuli by moving up a chemical gradient to identify and
target a host. Mosquitoes, in particular, are believed to use olfaction to
identify and target sources of bloodmeal for reproductive purposes. This
behavior contributes to the spread of human diseases, such as malaria,
encephalitis, and dengue fever, as well as animal and livestock diseases.

Olfaction plays a critical role in insect behaviors among agricultural 
pests and disease vectors. Hildebrand, et al., 1997, Annu. Rev. Neurosci.,
20:595-631. In Drosophila melanogaster (the common fruit fly), the olfactory
system functions through a rapid cycling between an on and off state of
certain regulatory molecules. The olfactory signal transduction cascade is
"turned on" by ligand-based activation of an odorant receptor and
transduction of the signal by G-protein coupled second messenger pathways.
Boekhoff et al., 1994, J. Neurosci., 14:3304-9. The "on signal" is rapidly and
substantially terminated in the Drosophila system through the modification
of the odorant receptor such that the G-protein coupled second messenger
pathway is deactivated. Dohlman et al., 1991, Annual Review of
Biochemistry, 60:653-88. Olfactory transduction is provided by second
messenger pathways of G protein-coupled receptors. Reed, R., 1992, Neuron
8:205-209; Boekhoff, et al., 1994, J. Neurosci. 14:3304-3309.

The structural and functional characteristics of the mosquito olfactory
system have not been characterized to date. Given the importance of
controlling this pest and disease vector, what is needed is the identification
and characterization of the genes and polypeptides that function in
mosquito olfaction and methods of use thereof for mosquito management.

DISCLOSURE OF THE INVENTION 

The present invention provides, in part, eight novel mosquito 
polypeptides and nucleic acids encoding the polypeptides (collectively
referred to herein as "mosquito olfaction molecules"). Seven of the
polypeptides are novel mosquito odorant receptors and the eighth is a novel
mosquito arrestin molecule (see Figure 8). The odorant receptor molecules
are discovered to function in a ligand-induced signal transduction pathway
for the activation of mosquito olfaction. The mosquito arrestin molecule is
discovered to function to inhibit the activated signal transduction cascade.
Thus, the odorant receptors can be viewed as parts of an "on switch" or an
"on signal" and the arrestin molecule can be viewed as an "off switch" or an
"off signal" for the odorant detection system of the mosquito. The present 
invention is not bound by theory or mechanism. 

The present invention also provides, in part, a system for disrupting 
the mosquito olfactory system by disrupting, inhibiting, or otherwise 
interfering with the function of the off switch for mosquito olfaction. Such 
interference is contemplated to inhibit or degrade the ability of the 
mosquito to appropriately respond to chemical cues in the environment
used by the mosquito for host identification and targeting. For example, if
the signal cascade cannot be terminated or inhibited, then the mosquito is
impaired in following a chemical gradient to a host through sampling of
the frequency of ligand-induced activation of the olfaction signal cascade.
In this example, the chemical concentration of the odorant is expected to 
increase with decreasing distance to the target. Thus, receptor activation 
is expected to increase with decreasing distance to the target. It is a 
discovery of the present invention, that factors that inhibit the on and off 
cycling of the mosquito olfactory signal cascade through inhibition of 
signal deactivation are useful for the control of mosquitoes. Test agents 
used in a method for identifying mosquito olfaction molecule binding 
compounds would include, but are not limited to: chemicals, proteins, 
peptides, organic compounds and lipids. Such factors that inhibit signal 
deactivation may be peptides and chemicals. Several classes of chemicals 
that would be selected as targets are the carboxylic acids and steroids that 
are components of human sweat. Cork, A. (1996). Olfactory sensing is the 
basis of host location by mosquitoes and other hematophagous Diptera. In 
Olfaction in Mosquito-Host Interactions, G. R. Bock and G. Cardew, eds.
(Chichester, New York, Brisbane, Toronto, Singapore: John Wiley & Sons), 
pp. 71-84. Furthermore, certain aspects of the present invention are 
contemplated to be effective for insects in general. 
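
The gradient-following argument above can be illustrated with a toy two-state
receptor model. This is only a sketch under stated assumptions, not a model
taken from this disclosure: a single receptor is assumed to switch on at a
rate proportional to odorant concentration (hypothetical parameter k_on) and
to switch off at a fixed rate (hypothetical parameter k_off, standing in for
arrestin-mediated deactivation). Under those assumptions the mean time per
on/off cycle is 1/(k_on*c) + 1/k_off, so the steady-state activation
frequency is k_on*c*k_off / (k_on*c + k_off).

```python
def cycling_frequency(concentration, k_on=1.0, k_off=5.0):
    """Steady-state on/off cycles per unit time for a toy two-state receptor.

    concentration: odorant concentration (arbitrary units).
    k_on:  hypothetical activation rate constant (per concentration unit).
    k_off: hypothetical deactivation rate; 0.0 models a blocked "off switch".
    """
    if concentration < 0 or k_on < 0 or k_off < 0:
        raise ValueError("concentration and rates must be non-negative")
    on_rate = k_on * concentration
    total = on_rate + k_off
    if total == 0:
        return 0.0
    # Mean cycle time is 1/on_rate + 1/k_off, so frequency is the reciprocal.
    return on_rate * k_off / total


if __name__ == "__main__":
    # Concentration rises as the distance to the host falls.
    for c in (0.5, 1.0, 2.0, 4.0):
        intact = cycling_frequency(c)
        blocked = cycling_frequency(c, k_off=0.0)
        print(f"c={c}: intact={intact:.3f} blocked={blocked:.3f}")
```

In this sketch the activation frequency rises with concentration only while
k_off is nonzero. With k_off set to zero the receptor stays on after its
first activation, the frequency falls to zero at every concentration, and
the frequency no longer carries information about distance to the target.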

Methods are presented for identifying compounds that interfere with 
the operation of the mosquito olfactory system resulting in an
overstimulation of olfactory signaling. One consequence of interfering with the
mosquito olfactory system is that the mosquito has a diminished ability to
home in on sources of bloodmeal. Additionally, interfering with mosquito
olfactory systems will inhibit mating and feeding, having a significant
impact on mosquito populations, and is helpful, for example, in nuisance and
disease vector control for humans and livestock. Interfering with
non-mosquito insect olfaction will similarly have a positive impact in control of
other insect populations, including for the protection of crops such as wheat,
corn, rice, cotton, and soybeans. Thus, certain aspects of the present
invention provide screening assays for the identification of compositions that
will reduce the ability of mosquitoes to locate sources of bloodmeal, such as
humans and other mammals, including livestock (cattle, pigs, horses, sheep,
etc.), show animals (horses, pigs, sheep, dogs, cats, etc.), and pets (dogs, cats,
horses, etc.). Certain aspects of the present invention provide a screening
assay for the production of "mosquito olfaction molecules." 

One aspect of the present invention provides an isolated DNA
comprising a nucleotide sequence that encodes arrestin 1 polypeptide (e.g.,
SEQ ID NO: 2). In certain embodiments, arrestin 1 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 1, or the
complement of SEQ ID NO: 1. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae arrestin 1 polypeptides. In certain
embodiments, the nucleotide sequence may be that of SEQ ID NO: 1. In
alternate embodiments, the nucleotide sequence may encode a fragment of
SEQ ID NO: 2 at least 20 residues in length. One of ordinary skill in the art
knows that a polypeptide fragment having a length of 20 residues is capable
of functioning as an immunogen. In certain embodiments, the nucleotide
sequence may encode a polypeptide having a conservatively modified amino
acid sequence of SEQ ID NO: 2. In certain embodiments, the isolated
polynucleotide comprises a complement to a sequence that encodes a
polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO: 2, and conservatively modified SEQ ID NO: 2. In
alternate embodiments, the nucleotide sequence may be that of degenerate
variants of above-mentioned sequences. The invention also includes operably
linking one or more expression control sequences to any of the
above-mentioned nucleotide sequences. The invention also includes a cell
comprising any of the above-mentioned nucleotide sequences operably linked
to one or more expression control sequences.
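
The notion of a degenerate variant mentioned above can be sketched in a few
lines of Python. The sketch assumes the standard genetic code and uses short
made-up sequences; the function names and the example sequences are
hypothetical and are not taken from SEQ ID NO: 1 or any other sequence of this
disclosure. Two coding sequences are treated as degenerate variants of one
another when they differ at the nucleotide level but translate to the same
amino acid sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3) to protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(seq_a, seq_b):
    """True if the sequences differ but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    original = "ATGGCTTTA"   # hypothetical example: Met-Ala-Leu
    variant = "ATGGCCCTG"    # different codons, same Met-Ala-Leu
    print(translate(original), translate(variant))
    print(is_degenerate_variant(original, variant))
```

Because most amino acids are specified by more than one codon, many distinct
nucleotide sequences can encode any given polypeptide.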

The present invention also provides an isolated DNA comprising a
nucleotide sequence that encodes odorant receptor 1 polypeptide (e.g., SEQ
ID NO: 4). In certain embodiments, odorant receptor 1 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 3, or the
complement of SEQ ID NO: 3. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae odorant receptor 1 polypeptides. In
certain embodiments, the nucleotide sequence may be that of SEQ ID NO: 3.
In alternate embodiments, the nucleotide sequence may encode a fragment
of SEQ ID NO: 4 at least 20 residues in length. One of ordinary skill in the
art knows that a polypeptide fragment having a length of 20 residues is
capable of functioning as an immunogen. In certain embodiments, the
nucleotide sequence may encode a polypeptide having a conservatively
modified amino acid sequence of SEQ ID NO: 4. In certain embodiments, the
isolated polynucleotide comprises a complement to a sequence that
encodes a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO: 4, and conservatively modified SEQ ID
NO: 4. In other alternate embodiments, the nucleotide sequence may be that
of degenerate variants of above-mentioned sequences. The invention also
includes operably linking one or more expression control sequences to any of
the above-mentioned nucleotide sequences. The invention also includes a cell
comprising any of the above-mentioned nucleotide sequences operably linked
to one or more expression control sequences.

The present invention provides an isolated DNA comprising a
nucleotide sequence that encodes odorant receptor 2 polypeptide (e.g., SEQ
ID NO: 6). In certain embodiments, odorant receptor 2 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 5, or the
complement of SEQ ID NO: 5. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae odorant receptor 2 polypeptides. In
certain embodiments, the nucleotide sequence may be that of SEQ ID NO: 5.
In alternate embodiments, the nucleotide sequence may encode a fragment
of SEQ ID NO: 6 at least 20 residues in length. One of ordinary skill in the
art knows that a polypeptide fragment having a length of 20 residues is
capable of functioning as an immunogen. In certain embodiments, the
nucleotide sequence may encode a polypeptide having a conservatively
modified amino acid sequence of SEQ ID NO: 6. In certain embodiments, the
isolated polynucleotide comprises a complement to a sequence that
encodes a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO: 6, and conservatively modified SEQ ID
NO: 6. In other alternate embodiments, the nucleotide sequence may be that
of degenerate variants of above-mentioned sequences. The invention also
includes operably linking one or more expression control sequences to any of
the above-mentioned nucleotide sequences. The invention also includes a cell
comprising any of the above-mentioned nucleotide sequences operably linked
to one or more expression control sequences.

The present invention also provides an isolated DNA comprising a
nucleotide sequence that encodes odorant receptor 3 polypeptide (e.g., SEQ
ID NO: 8). In certain embodiments, odorant receptor 3 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 7, or the
complement of SEQ ID NO: 7. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae odorant receptor 3 polypeptides. In
certain embodiments, the nucleotide sequence may be that of SEQ ID NO: 7.
In alternate embodiments, the nucleotide sequence may encode a fragment
of SEQ ID NO: 8 at least 20 residues in length. One of ordinary skill in the
art knows that a polypeptide fragment having a length of 20 residues is
capable of functioning as an immunogen. In certain embodiments, the
nucleotide sequence may encode a polypeptide having a conservatively
modified amino acid sequence of SEQ ID NO: 8. In certain embodiments, the
isolated polynucleotide comprises a complement to a sequence that
encodes a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO: 8, and conservatively modified SEQ ID
NO: 8. In other alternate embodiments, the nucleotide sequence may be that
of degenerate variants of above-mentioned sequences. The invention also
includes operably linking one or more expression control sequences to any of
the above-mentioned nucleotide sequences. The invention also includes a cell
comprising any of the above-mentioned nucleotide sequences operably linked
to one or more expression control sequences.

The present invention also provides an isolated DNA comprising a
nucleotide sequence that encodes odorant receptor 4 polypeptide (e.g., SEQ
ID NO: 14). In certain embodiments, odorant receptor 4 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 13, or the
complement of SEQ ID NO: 13. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae odorant receptor 4 polypeptides. In
certain embodiments, the nucleotide sequence may be that of SEQ ID NO:
13. In alternate embodiments, the nucleotide sequence may encode a
fragment of SEQ ID NO: 14 at least 20 residues in length. One of ordinary
skill in the art knows that a polypeptide fragment having a length of 20
residues is capable of functioning as an immunogen. In certain
embodiments, the nucleotide sequence may encode a polypeptide having a
conservatively modified amino acid sequence of SEQ ID NO: 14. In certain
embodiments, the isolated polynucleotide comprises a complement to a
sequence that encodes a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO: 14, and conservatively
modified SEQ ID NO: 14. In other alternate embodiments, the nucleotide
sequence may be that of degenerate variants of above-mentioned sequences.
The invention also includes operably linking one or more expression control
sequences to any of the above-mentioned nucleotide sequences. The
invention also includes a cell comprising any of the above-mentioned
nucleotide sequences operably linked to one or more expression control
sequences.

WO 02/059274
PCT/US02/02549

The present invention also provides an isolated DNA comprising a
nucleotide sequence that encodes odorant receptor 5 polypeptide (e.g., SEQ
ID NO: 16). In certain embodiments, odorant receptor 5 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 15, or the
complement of SEQ ID NO: 15. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae odorant receptor 5 polypeptides. In
certain embodiments, the nucleotide sequence may be that of SEQ ID NO:
15. In alternate embodiments, the nucleotide sequence may encode a
fragment of SEQ ID NO: 16 at least 20 residues in length. One of ordinary
skill in the art knows that a polypeptide fragment having a length of 20
residues is capable of functioning as an immunogen. In certain
embodiments, the nucleotide sequence may encode a polypeptide having a
conservatively modified amino acid sequence of SEQ ID NO: 16. In certain
embodiments, the isolated polynucleotide comprises a complement to a
sequence that encodes a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO: 16, and conservatively
modified SEQ ID NO: 16. In other alternate embodiments, the nucleotide
sequence may be that of degenerate variants of above-mentioned sequences.
The invention also includes operably linking one or more expression control
sequences to any of the above-mentioned nucleotide sequences. The
invention also includes a cell comprising any of the above-mentioned
nucleotide sequences operably linked to one or more expression control
sequences.

The present invention also provides an isolated DNA comprising a
nucleotide sequence that encodes odorant receptor 6 polypeptide (e.g., SEQ
ID NO: 18). In certain embodiments, odorant receptor 6 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 17, or the
complement of SEQ ID NO: 17. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae odorant receptor 6 polypeptides. In
certain embodiments, the nucleotide sequence may be that of SEQ ID NO:
17. In alternate embodiments, the nucleotide sequence may encode a
fragment of SEQ ID NO: 18 at least 20 residues in length. One of ordinary
skill in the art knows that a polypeptide fragment having a length of 20
residues is capable of functioning as an immunogen. In certain
embodiments, the nucleotide sequence may encode a polypeptide having a
conservatively modified amino acid sequence of SEQ ID NO: 18. In certain
embodiments, the isolated polynucleotide comprises a complement to a
sequence that encodes a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO: 18, and conservatively
modified SEQ ID NO: 18. In other alternate embodiments, the nucleotide
sequence may be that of degenerate variants of above-mentioned sequences.
The invention also includes operably linking one or more expression control
sequences to any of the above-mentioned nucleotide sequences. The
invention also includes a cell comprising any of the above-mentioned
nucleotide sequences operably linked to one or more expression control
sequences.

The present invention also provides an isolated DNA comprising a
nucleotide sequence that encodes odorant receptor 7 polypeptide (e.g., SEQ
ID NO: 20). In certain embodiments, odorant receptor 7 nucleotide sequence
comprises a DNA molecule that hybridizes under stringent conditions to a
DNA having a nucleotide sequence consisting of SEQ ID NO: 19, or the
complement of SEQ ID NO: 19. Preferably the isolated DNA encodes
naturally-occurring Anopheles gambiae odorant receptor 7 polypeptides. In
certain embodiments, the nucleotide sequence may be that of SEQ ID NO:
19. In alternate embodiments, the nucleotide sequence may encode a
fragment of SEQ ID NO: 20 at least 20 residues in length. One of ordinary
skill in the art knows that a polypeptide fragment having a length of 20
residues is capable of functioning as an immunogen. In certain
embodiments, the nucleotide sequence may encode a polypeptide having a
conservatively modified amino acid sequence of SEQ ID NO: 20. In certain
embodiments, the isolated polynucleotide comprises a complement to a
sequence that encodes a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO: 20, and conservatively
modified SEQ ID NO: 20. In other alternate embodiments, the nucleotide
sequence may be that of degenerate variants of above-mentioned sequences.
The invention also includes operably linking one or more expression control
sequences to any of the above-mentioned nucleotide sequences. The
invention also includes a cell comprising any of the above-mentioned
nucleotide sequences operably linked to one or more expression control
sequences.

The present invention provides a substantially pure arrestin 1
polypeptide that includes an amino acid sequence that contains at least a
conservatively modified identity with SEQ ID NO: 2 and binds to odorant
receptors. The amino acid sequence of arrestin 1 protein can differ from SEQ
ID NO: 2 by non-conservative substitutions, deletions, or insertions located
at positions that do not destroy the function of the arrestin 1 polypeptide. In
alternate embodiments, the polypeptide has an amino acid sequence
consisting of SEQ ID NO: 2. The purified polypeptide is a polypeptide that
binds specifically to an antibody that binds specifically to mosquito arrestin.
In other alternate embodiments, the polypeptide comprises fragments of
SEQ ID NO: 2, having at least 20 consecutive residues.
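
The fragment criterion above (at least 20 consecutive residues of a reference
sequence) can be checked mechanically. The following is a minimal sketch; the
function names are hypothetical and the 30-residue reference string is a
made-up placeholder, not SEQ ID NO: 2 or any sequence of this disclosure.

```python
def is_qualifying_fragment(candidate, reference, min_length=20):
    """True if candidate is a run of >= min_length consecutive reference residues."""
    candidate = candidate.upper()
    reference = reference.upper()
    return len(candidate) >= min_length and candidate in reference


def fragments(reference, length=20):
    """List every window of `length` consecutive residues in the reference."""
    return [reference[i:i + length] for i in range(len(reference) - length + 1)]


if __name__ == "__main__":
    # Made-up 30-residue placeholder sequence, for illustration only.
    reference = "MKTLLVAGSDEQRNHIFWYCPMKTLAGSVD"
    print(len(fragments(reference)))                          # number of 20-mers
    print(is_qualifying_fragment(reference[5:25], reference))  # 20 residues
    print(is_qualifying_fragment(reference[:19], reference))   # too short
```

A reference of N residues contains N - 19 distinct starting positions for a
20-residue fragment, so longer polypeptides offer proportionally more
candidate fragments.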

The present invention also provides a substantially pure odorant
receptor 1 polypeptide that includes an amino acid sequence that contains at
least a conservatively modified identity with SEQ ID NO: 4 and binds to
arrestin. The amino acid sequence of odorant receptor 1 polypeptide can
differ from SEQ ID NO: 4 by non-conservative substitutions, deletions, or
insertions located at positions that do not destroy the function of the odorant
receptor 1 polypeptide. In alternate embodiments, the polypeptide has an
amino acid sequence consisting of SEQ ID NO: 4. In other alternate
embodiments, the polypeptide comprises fragments of SEQ ID NO: 4, having
at least 20 consecutive residues.

The present invention provides a substantially pure odorant receptor
2 polypeptide that includes an amino acid sequence that contains at least a
conservatively modified identity with SEQ ID NO: 6 and binds to arrestin.
The amino acid sequence of odorant receptor 2 polypeptide can differ from
SEQ ID NO: 6 by non-conservative substitutions, deletions, or insertions
located at positions that do not destroy the function of the odorant receptor 2
polypeptide. In alternate embodiments, the polypeptide has an amino acid
sequence consisting of SEQ ID NO: 6. In other alternate embodiments, the
polypeptide comprises fragments of SEQ ID NO: 6, having at least 20
consecutive residues.
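
As an illustration of what a conservatively modified sequence might look like,
the sketch below tests whether a variant differs from a reference only by
substitutions within the same residue group. The six groups used are an
assumption made for this example (a commonly cited grouping of chemically
similar amino acids); the definition that governs this disclosure may differ.
The function name and example peptides are hypothetical, and the sketch covers
substitutions only, not deletions or insertions.

```python
# Assumed grouping of chemically similar residues, for illustration only.
CONSERVATIVE_GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]
GROUP_OF = {
    residue: index
    for index, group in enumerate(CONSERVATIVE_GROUPS)
    for residue in group
}


def is_conservatively_modified(variant, reference):
    """True if variant differs from reference only by same-group substitutions."""
    if len(variant) != len(reference):
        return False  # insertions and deletions are outside this sketch
    changed = False
    for v, r in zip(variant.upper(), reference.upper()):
        if v == r:
            continue
        # A substitution counts only if both residues share an assumed group.
        if v not in GROUP_OF or GROUP_OF[v] != GROUP_OF.get(r):
            return False
        changed = True
    return changed


if __name__ == "__main__":
    reference = "MKTLE"  # made-up peptide
    print(is_conservatively_modified("MRTLD", reference))  # K->R, E->D
    print(is_conservatively_modified("MKTLW", reference))  # E->W crosses groups
```

Residues outside the assumed groups (for example glycine, proline, cysteine,
and histidine here) are treated as having no conservative replacement in this
sketch.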

The present invention also provides a substantially pure odorant
receptor 3 polypeptide that includes an amino acid sequence that contains at
least a conservatively modified identity with SEQ ID NO: 8 and binds to
arrestin. The amino acid sequence of odorant receptor 3 polypeptide can
differ from SEQ ID NO: 8 by non-conservative substitutions, deletions, or
insertions located at positions that do not destroy the function of the odorant
receptor 3 polypeptide. In alternate embodiments, the polypeptide has an
amino acid sequence consisting of SEQ ID NO: 8. In other alternate
embodiments, the polypeptide comprises fragments of SEQ ID NO: 8, having
at least 20 consecutive residues.

The present invention also provides a substantially pure odorant
receptor 4 polypeptide that includes an amino acid sequence that contains at
least a conservatively modified identity with SEQ ID NO: 14 and binds to
arrestin. The amino acid sequence of odorant receptor 4 polypeptide can
differ from SEQ ID NO: 14 by non-conservative substitutions, deletions, or
insertions located at positions that do not destroy the function of the odorant
receptor 4 polypeptide. In alternate embodiments, the polypeptide has an
amino acid sequence consisting of SEQ ID NO: 14. In other alternate
embodiments, the polypeptide comprises fragments of SEQ ID NO: 14,
having at least 20 consecutive residues.

The present invention also provides a substantially pure odorant
receptor 5 polypeptide that includes an amino acid sequence that contains at
least a conservatively modified identity with SEQ ID NO: 16 and binds to
arrestin. The amino acid sequence of odorant receptor 5 polypeptide can
differ from SEQ ID NO: 16 by non-conservative substitutions, deletions, or
insertions located at positions that do not destroy the function of the odorant
receptor 5 polypeptide. In alternate embodiments, the polypeptide has an
amino acid sequence consisting of SEQ ID NO: 16. In other alternate
embodiments, the polypeptide comprises fragments of SEQ ID NO: 16,
having at least 20 consecutive residues.

The present invention also provides a substantially pure odorant
receptor 6 polypeptide that includes an amino acid sequence that contains at
least a conservatively modified identity with SEQ ID NO: 18 and binds to
arrestin. The amino acid sequence of odorant receptor 6 polypeptide can
differ from SEQ ID NO: 18 by non-conservative substitutions, deletions, or
insertions located at positions that do not destroy the function of the odorant
receptor 6 polypeptide. In alternate embodiments, the polypeptide has an
amino acid sequence consisting of SEQ ID NO: 18. In other alternate
embodiments, the polypeptide comprises fragments of SEQ ID NO: 18,
having at least 20 consecutive residues.

The present invention also provides a substantially pure odorant
receptor 7 polypeptide that includes an amino acid sequence that contains at
least a conservatively modified identity with SEQ ID NO: 20 and binds to
arrestin. The amino acid sequence of odorant receptor 7 polypeptide can
differ from SEQ ID NO: 20 by non-conservative substitutions, deletions, or
insertions located at positions that do not destroy the function of the odorant
receptor 7 polypeptide. In alternate embodiments, the polypeptide has an
amino acid sequence consisting of SEQ ID NO: 20. In other alternate
embodiments, the polypeptide comprises fragments of SEQ ID NO: 20,
having at least 20 consecutive residues.

The invention also provides an arrestin 1 antibody, which comprises
polyclonal or monoclonal antibodies. The antibody can be conjugated to a
detectable label.

Another aspect of the present invention provides an odorant receptor
1 antibody, which comprises polyclonal or monoclonal antibodies. The
antibody can be conjugated to a detectable label. Antibody labels and
methods are well known in the art.

The present invention also provides an odorant receptor 2 antibody,
which comprises polyclonal or monoclonal antibodies. The antibody can be
conjugated to a detectable label.

Another aspect of the present invention provides an odorant receptor
3 antibody, which comprises polyclonal or monoclonal antibodies. The
antibody can be conjugated to a detectable label.

Another aspect of the present invention provides an odorant receptor
4 antibody, which comprises polyclonal or monoclonal antibodies. The
antibody can be conjugated to a detectable label.

Another aspect of the present invention provides an odorant receptor
5 antibody, which comprises polyclonal or monoclonal antibodies. The
antibody can be conjugated to a detectable label.

Another aspect of the present invention provides an odorant receptor
6 antibody, which comprises polyclonal or monoclonal antibodies. The
antibody can be conjugated to a detectable label.

Another aspect of the present invention provides an odorant receptor
7 antibody, which comprises polyclonal or monoclonal antibodies. The
antibody can be conjugated to a detectable label.

The present invention also presents a method of producing arrestin 1
protein. The method includes the following steps: (a) providing a cell
transformed with an isolated DNA comprising a nucleotide sequence that
encodes an amino acid sequence of SEQ ID NO: 2; (b) culturing the cell; and
(c) collecting from the cell or the medium of the cell the polypeptide encoded
by the polynucleotide sequence. Certain alternatives to SEQ ID NO: 2 are
described above (e.g., conservative variants and hybridization variants).

The present invention also provides a method of manufacturing
odorant receptor 1 protein. The method includes the following steps: (a)
providing a cell transformed with an isolated DNA comprising a nucleotide
sequence that encodes an amino acid sequence of SEQ ID NO: 4; (b)
culturing the cell; and (c) collecting from the cell or the medium of the cell
the polypeptide encoded by the polynucleotide sequence.

The present invention provides a method of manufacturing odorant
receptor 2 protein. The method includes the following steps: (a) providing a
cell transformed with an isolated DNA comprising a nucleotide sequence
that encodes an amino acid sequence of SEQ ID NO: 6; (b) culturing the cell;
and (c) collecting from the cell or the medium of the cell the polypeptide
encoded by the polynucleotide sequence.

The present invention also provides a method of manufacturing
odorant receptor 3 protein. The method includes the following steps: (a)
providing a cell transformed with an isolated DNA comprising a nucleotide
sequence that encodes an amino acid sequence of SEQ ID NO: 8; (b)
culturing the cell; and (c) collecting from the cell or the medium of the cell
the polypeptide encoded by the polynucleotide sequence.

The present invention also provides a method of manufacturing
odorant receptor 4 protein. The method includes the following steps: (a)
providing a cell transformed with an isolated DNA comprising a nucleotide
sequence that encodes an amino acid sequence of SEQ ID NO: 14; (b)
culturing the cell; and (c) collecting from the cell or the medium of the cell
the polypeptide encoded by the polynucleotide sequence.

The present invention also provides a method of manufacturing
odorant receptor 5 protein. The method includes the following steps: (a)
providing a cell transformed with an isolated DNA comprising a nucleotide
sequence that encodes an amino acid sequence of SEQ ID NO: 16; (b)
culturing the cell; and (c) collecting from the cell or the medium of the cell
the polypeptide encoded by the polynucleotide sequence.

The present invention also provides a method of manufacturing
odorant receptor 6 protein. The method includes the following steps: (a)
providing a cell transformed with an isolated DNA comprising a nucleotide
sequence that encodes an amino acid sequence of SEQ ID NO: 18; (b)
culturing the cell; and (c) collecting from the cell or the medium of the cell
the polypeptide encoded by the polynucleotide sequence.

The present invention also provides a method of manufacturing
odorant receptor 7 protein. The method includes the following steps: (a)
providing a cell transformed with an isolated DNA comprising a nucleotide
sequence that encodes an amino acid sequence of SEQ ID NO: 20; (b)
culturing the cell; and (c) collecting from the cell or the medium of the cell
the polypeptide encoded by the polynucleotide sequence.

The present invention also provides a method for identifying a
mosquito olfaction molecule binding compound. The method includes the
following steps: (a) providing an isolated mosquito olfaction molecule; (b)
contacting a test agent with the isolated mosquito olfaction molecule; and
(c) detecting whether the test agent is bound to the isolated mosquito
olfaction molecule. Methods of detection are well known in the art. In
certain embodiments, the isolated mosquito olfaction molecule further
comprises a polypeptide having an amino acid sequence as set forth in
SEQ ID NO: 2 or variants thereof as described herein (as used herein, this
statement means conservatively modified variants, hybridization variants,
and variants to which antibodies bind specifically). In alternate
embodiments, the isolated mosquito olfaction molecule further comprises a
polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8,
SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20,
conservatively modified SEQ ID NO: 4, conservatively modified SEQ ID
NO: 6, conservatively modified SEQ ID NO: 8, conservatively modified
SEQ ID NO: 14, conservatively modified SEQ ID NO: 16, conservatively
modified SEQ ID NO: 18, and conservatively modified SEQ ID NO: 20. In
other embodiments, contacting the test agent with the isolated mosquito
olfaction molecule further comprises contacting under native conditions.
In alternate embodiments, detecting specific binding of the test agent to
the isolated mosquito olfaction molecule further comprises
immunoprecipitation.

The present invention also presents a screening method for
identifying a compound that inhibits binding of mosquito arrestin to a
mosquito odorant receptor. The method includes the following steps: (a)
providing an antibody that binds to an isolated mosquito olfaction
molecule; (b) providing a mosquito olfaction molecule binding compound;
(c) providing a test sample comprising the mosquito arrestin polypeptide 
and mosquito odorant receptor; (d) combining the mosquito olfaction 
molecule binding compound, the antibody, and the test sample in reaction 
conditions that allow a complex to form in the absence of the mosquito 

10 olfaction molecule binding compound, wherein the complex includes the 
antibody, mosquito arrestin and mosquito odorant receptor; and (e) 
determining whether the mosquito olfaction molecule binding compound 
decreases the formation of the complex, wherein a decrease indicates that 
the mosquito olfaction molecule binding compound is a compound that 

15 inhibits the binding of mosquito arrestin to mosquito odorant receptor. In 
certain embodiments, the mosquito odorant receptor further comprises a 
polypeptide having any of the following sequences: SEQ ID NO: 4, SEQ ID 
NO: 6, SEQ ID NO: 8, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, 
SEQ ID NO: 20, conservatively modified SEQ ID NO: 4, conservatively 

20 modified SEQ ID NO: 6, conservatively modified SEQ ID NO: 8, 
conservatively modified SEQ ID NO: 16, conservatively modified SEQ ID 
NO: 18, conservatively modified SEQ ID NO: 20 or conservatively modified 
SEQ ID NO: 14. 

Various features and advantages of the invention will be apparent 
from the following detailed description and from the claims. 

FIG. 1 is the nucleotide sequence (SEQ ID NO: 1) of arrestin 1 
isolated from Anopheles gambiae. 

FIG. 2 is the deduced amino acid sequence of arrestin 1 isolated from 
Anopheles gambiae (SEQ ID NO: 2). 

FIG. 3a-b are the nucleotide sequence (SEQ ID NO: 9) and deduced 
amino acid sequence (SEQ ID NO: 4) of odorant receptor 1 isolated from 
Anopheles gambiae. 

FIG. 4a-b are the nucleotide sequence (SEQ ID NO: 10) and deduced 
amino acid sequence (SEQ ID NO: 6) of odorant receptor 2 isolated from 
Anopheles gambiae. 

FIG. 5a-b are the nucleotide sequence (SEQ ID NO: 11) and deduced 
amino acid sequence (SEQ ID NO: 8) of odorant receptor 3 isolated from 
Anopheles gambiae. 

FIG. 6a-b are the nucleotide sequence (SEQ ID NO: 13) and deduced 
amino acid sequence (SEQ ID NO: 14) of odorant receptor 4 isolated from 
Anopheles gambiae. 

FIG. 7 is a table of preferred codons used to deduce amino acid 
sequences from nucleotide sequences for Anopheles gambiae. 

FIG. 8 is a table listing cDNA and polypeptide sequences with 
corresponding SEQ ID numbers and Figure numbers. 

FIG. 9a-b are the nucleotide sequence (SEQ ID NO: 21) and deduced 
amino acid sequence (SEQ ID NO: 16) of odorant receptor 5 isolated from 
Anopheles gambiae. 

FIG. 10a-b are the nucleotide sequence (SEQ ID NO: 22) and deduced 
amino acid sequence (SEQ ID NO: 18) of odorant receptor 6 isolated from 
Anopheles gambiae. 

FIG. 11a-b are the nucleotide sequence (SEQ ID NO: 23) and deduced 
amino acid sequence (SEQ ID NO: 20) of odorant receptor 7 isolated from 
Anopheles gambiae. 

BEST MODE FOR CARRYING OUT THE INVENTION 

Arrestins interact with odorant receptors to cause changes in 
cellular function. Interruption of normal arrestin function will lead to 
overstimulation of the olfaction system. Consequently, substances that block 
the arrestin-odorant receptor interaction can interfere with a mosquito's 
ability to home in on sources of bloodmeal, such as humans. Screening for 
substances that modulate the arrestin-odorant receptor interaction is 
therefore useful for identifying pest control agents and for treatment of 
malaria. The deduced amino acid sequence of arrestin contains several 
domains implicated in arrestin function. The motifs include potential 
consensus Src homology 3 (SH3) binding sites. Cohen, et al., 1995, Cell, 
80:237. Sequence comparisons with the DDBJ/EMBL/GenBank and SWISSPROT 
databases were performed using the GCG software. Devereux, et al., 
1984, Nucleic Acids Res., 12:387-395. Protein alignment was also 
performed using the Clustal W software package. Thompson, et al., 1994, 
Nucleic Acids Res., 22:4673-4680. Additionally, arrestin has been 
submitted to the GenBank database with accession No. AY017417. 

As used herein, "native conditions" means the conditions ordinarily 
found within Anopheles gambiae. 

As used herein, "stringent conditions" means the following: 
hybridization at 42° C in the presence of 50% formamide; a first wash at 
65° C with about 2 x SSC containing 1% SDS; followed by a second wash at 
65° C with 0.1 x SSC. Salt concentrations and temperature may be 
modified. Such modifications may be found in Sambrook et al., 1989, 
Molecular Cloning: A Laboratory Manual (2nd Edition), Cold Spring 
Harbor Press, Cold Spring Harbor, N.Y. The hybridizing part of the 
nucleic acid is generally at least 15 nucleotides in length. 

As used herein, "purified polypeptide" means a polypeptide that is 
substantially free from compounds normally associated with the 
polypeptide in the natural state. The absence of such compounds may be 
determined by detection of protein bands subsequent to SDS-PAGE. 
Purity may also be assessed in other ways known to those of ordinary skill 
in the art. The term, as defined herein, is not intended to exclude (1) 
synthetic or artificial combinations of the polypeptides with other 
compounds, or (2) polypeptides having minor impurities which do not 
interfere with biological activity. 

As used herein, "isolated polynucleotide" means a polynucleotide 
having a structure that is not identical to that of any naturally occurring 
nucleic acid or to that of any fragment of a naturally occurring genomic 
nucleic acid spanning more than three separate genes. Thus, the term 
includes (1) a nucleic acid incorporated into a vector or into the genomic 
DNA of a prokaryote or eukaryote in a manner such that the resulting 
molecule is not identical to any naturally occurring vector or genomic 
DNA; (2) a separate molecule of a cDNA, a genomic fragment, a fragment 
produced by polymerase chain reaction (PCR), or a restriction fragment; 
and (3) a recombinant nucleotide sequence that is part of a gene encoding 
a fusion protein. This definition of "isolated polynucleotide" supersedes 
and controls all other definitions known in the art. 

As used herein, "hybridization probe" means nucleic acid that is 
labeled for detection, such as labeling with radiation. Hybridization probes 
are well known in the art. 

As used herein, "culturing the cell" means providing culture 
conditions that are conducive to polypeptide expression. Such culturing 
conditions are well known in the art. 

As used herein, "operably linked" means incorporated into a genetic 
construct so that expression control sequences effectively control 
expression of a gene of interest. 

As used herein, "protein" means any peptide-linked chain of amino 
acids, regardless of length or post-translational modification, e.g., 
glycosylation or phosphorylation. 

As used herein, "sequence identity" means the percentage of 
identical subunits at corresponding positions in two sequences when the 
two sequences are aligned to maximize subunit matching, i.e., taking into 
account gaps and insertions. When a subunit position in both of the two 
sequences is occupied by the same monomeric subunit, e.g., if a given 
position is occupied by an adenine in each of two DNA molecules, then the 
molecules are identical at that position. For example, if 7 positions in a 
sequence 10 nucleotides in length are identical to the corresponding 
positions in a second 10-nucleotide sequence, then the two sequences have 
70% sequence identity. Preferably, the length of the compared sequences 
is at least 60 nucleotides, more preferably at least 75 nucleotides, and 
most preferably 100 nucleotides. Sequence identity is typically measured 
using sequence analysis software (e.g., Sequence Analysis Software 
Package of the Genetics Computer Group, University of Wisconsin 
Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705). 
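
By way of illustration only, the 7-of-10 example above can be reproduced 
with a short Python sketch. The function name, the gap character, and the 
choice of the full alignment length as the denominator are assumptions made 
for this illustration; the sketch presumes the two sequences have already 
been aligned and is not the GCG package's procedure. 

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of aligned positions occupied by the same subunit.

    Both inputs must already be aligned (equal length, gaps written as
    `gap`). Dividing by the alignment length is an assumption of this
    sketch; other conventions exist.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # A gap aligned with a gap is not counted as an identical subunit.
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(seq_a)


# Two hypothetical 10-nucleotide sequences that agree at 7 positions.
first = "ACGTACGTAC"
second = "ACGTACGGGT"
print(percent_identity(first, second))
```

The two sequences in the usage lines are invented for the example and do 
not correspond to any SEQ ID NO disclosed herein. 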

As used herein, "mosquito olfaction molecule" means a polypeptide 
that is involved in the modulation of the mosquito olfaction system. By 
way of illustration, and not limitation, mosquito olfaction molecules have 
the following characteristics: (1) they are G protein-coupled 
seven-transmembrane domain receptors, (2) they show sequence conservation 
regarding positions of a subset of introns and the length of the deduced 
protein, (3) they are selectively expressed in olfactory receptor neurons, 
and (4) they have highly conserved structural motifs. Odorant receptors 3, 
4 and 5 are clustered tightly together within the A. gambiae genome. 
Odorant receptor 5 and odorant receptor 4 are separated by 310 bp while 
odorant receptor 4 and odorant receptor 3 are separated by 747 bp. An 
additional characteristic of odorant and taste receptor genes is the close 
chromosomal linkage. Such linkage has been demonstrated in D. melanogaster 
and in odorant receptor genes from C. elegans and mouse. Clyne, et al., 
1999, Neuron, 22:327-338; Vosshall, et al., 1999, Cell, 96:725-736; 
Vosshall, et al., 2000, Cell, 102:147-159; Clyne, et al., 2000, Science, 
287:1830-1834; Gao and Chess, 1999, Genomics, 60:31-39; Troemel, et al., 
1995, Cell, 83:207-218; Xie, et al., 2000, Genome, 11:1070-1080; Fox, et 
al., 2001, PNAS 98:14693-14697. This group of molecules includes odorant 
receptor 1 (SEQ ID NO: 4), odorant receptor 2 (SEQ ID NO: 6), odorant 
receptor 3 (SEQ ID NO: 8), odorant receptor 4 (SEQ ID NO: 14), odorant 
receptor 5 (SEQ ID NO: 16), odorant receptor 6 (SEQ ID NO: 18), odorant 
receptor 7 (SEQ ID NO: 20), arrestin 1 (SEQ ID NO: 2) and variants thereof 
as described herein. 

As used herein, "odorant receptor" means any molecule performing 
the functional role of an odorant receptor, as described herein and in the 
scientific literature. Examples of odorant receptors include, but are not 
limited to, odorant receptor 1, odorant receptor 2, odorant receptor 3, 
odorant receptor 4, odorant receptor 5, odorant receptor 6, and odorant 
receptor 7. 

As used herein, "mosquito olfaction molecule binding compound" 
means a compound that specifically binds to a mosquito olfaction molecule. 
Mosquito olfaction molecules additionally include polypeptides having the 
characteristics noted in the definition of the term. 

As used herein, "mosquito olfaction molecule-specific antibody" 
means an antibody that binds to a mosquito olfaction molecule. The term 
includes polyclonal and monoclonal antibodies. 

As used herein, "substantially pure protein" means a protein 
separated from components that naturally accompany it. Typically, the 
protein is substantially pure when it is at least 60%, by weight, free from 
the proteins and other naturally-occurring organic molecules with which it 
is naturally associated. In certain embodiments, the purity of the 
preparation is at least 75%, more preferably at least 90%, 95% and most 
preferably at least 99%, by weight. A substantially pure mosquito olfaction 
molecule protein can be obtained, for example, by extraction from a 
natural source, by expression of a recombinant nucleic acid encoding a 
mosquito olfaction molecule polypeptide, or by chemical synthesis. Purity 
can be measured by any appropriate method, e.g., column 
chromatography, polyacrylamide gel electrophoresis, or HPLC analysis. A 
chemically-synthesized protein or a recombinant protein produced in a cell 
type other than the cell type in which it naturally occurs is, by definition, 
substantially free from components that naturally accompany it. 
Accordingly, substantially pure proteins include those having sequences 
derived from eukaryotic organisms but synthesized in E. coli or other 
prokaryotes. 
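
The by-weight thresholds recited above (60%, 75%, 90%, 95%, 99%) can be 
illustrated with a minimal Python sketch. The function names, the tier 
labels, and the example masses are hypothetical and serve only to show how 
a measured purity maps onto the recited ranges. 

```python
# Thresholds taken from the definition above; labels are illustrative.
PURITY_TIERS = [
    (99.0, "most preferred (at least 99%)"),
    (95.0, "more preferred (at least 95%)"),
    (90.0, "more preferred (at least 90%)"),
    (75.0, "certain embodiments (at least 75%)"),
    (60.0, "substantially pure (at least 60%)"),
]


def purity_percent(target_mass: float, total_mass: float) -> float:
    """Purity by weight: mass of the protein of interest over total mass."""
    if total_mass <= 0 or not 0 <= target_mass <= total_mass:
        raise ValueError("masses must satisfy 0 <= target <= total, total > 0")
    return 100.0 * target_mass / total_mass


def purity_tier(percent: float) -> str:
    """Return the highest recited tier that the measured purity satisfies."""
    for threshold, label in PURITY_TIERS:
        if percent >= threshold:
            return label
    return "not substantially pure (below 60%)"


# Hypothetical preparation: 92 units of target protein in 100 units total.
print(purity_tier(purity_percent(92, 100)))
```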

As used herein, "fragment", as applied to a polypeptide (e.g., 
arrestin 1 polypeptide), means at least about 10 amino acids, usually 
about 20 contiguous amino acids, preferably at least 40 contiguous amino 
acids, more preferably at least 50 amino acids, and most preferably at 
least about 60 to 80 or more contiguous amino acids in length. Such 
peptides can be generated by methods known to those skilled in the art, 
including proteolytic cleavage of the protein, de novo synthesis of the 
fragment, or genetic engineering. 

As used herein, "test sample" means a sample that contains arrestin 
1, or conservatively modified variant thereof, in combination with at least 
one of the following: odorant receptor 1, odorant receptor 2, odorant 
receptor 3, odorant receptor 4, odorant receptor 5, odorant receptor 6, 
odorant receptor 7, conservatively modified variants of the above, or other 
odorant receptors known in the art. 

As used herein, "vector" means a replicable nucleic acid construct, 
e.g., a plasmid or viral nucleic acid. Preferably, expression is controlled by 
an expression control sequence. 

As used herein, "conservatively modified" applies to both amino acid 
and nucleic acid sequences. Regarding nucleic acid sequences, 
conservatively modified refers to those nucleic acids which encode 
identical or conservatively modified variants of the amino acid sequences. 
Because of the degeneracy of the genetic code, a large number of 
functionally identical nucleic acids encode any given protein. For example, 
the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. 
Thus, at every position where an alanine is specified by a codon, the codon 
can be altered to any of the corresponding codons described without 
altering the encoded polypeptide. Every nucleic acid sequence herein 
which encodes a polypeptide also describes every possible silent variation 
of the nucleic acid. One of ordinary skill will recognize that each codon in a 
nucleic acid (except AUG, which is ordinarily the only codon for 
methionine; and UGG, which is ordinarily the only codon for tryptophan) 
can be modified to yield a functionally identical molecule. Accordingly, 
each silent variation of a nucleic acid which encodes a polypeptide of the 
present invention is implicit in each described polypeptide sequence and 
incorporated herein by reference. 
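
The degeneracy described above can be illustrated with a minimal Python 
sketch that builds the standard genetic code and lists the synonymous 
codons for a given codon. The helper names and the short example RNA are 
hypothetical; the sketch assumes the standard code rather than the 
preferred-codon table of FIG. 7. 

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_codons(codon: str) -> list[str]:
    """All codons encoding the same amino acid as `codon` (inclusive)."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(rna: str) -> int:
    """Number of RNA sequences that encode the same polypeptide as `rna`."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    total = 1
    for i in range(0, len(rna), 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total


print(synonymous_codons("GCA"))            # the four alanine codons
print(synonymous_codons("AUG"))            # methionine has a single codon
print(count_silent_variants("AUGGCAUGG"))  # Met-Ala-Trp
```

For the hypothetical Met-Ala-Trp example, only the alanine codon can vary, 
so the count equals the number of alanine codons. 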

As to amino acid sequences, one of skill will recognize that 
individual substitutions, deletions or additions to a nucleic acid, peptide, 
polypeptide, or protein sequence which alter, add or delete a single 
amino acid or a small percentage of amino acids in the encoded sequence 
are "conservatively modified variants" where the alteration results in the 
substitution of an amino acid with a chemically similar amino acid. Thus, 
any number of amino acid residues selected from the group of integers 
consisting of from 1 to 15 can be so altered. For example, 1, 2, 3, 4, 5, 
7, or 10 alterations can be made. Conservatively modified variants 
typically provide similar biological activity as the unmodified polypeptide 
sequence from which they are derived. For example, substrate specificity, 
enzyme activity, or ligand/receptor binding is generally at least 30%, 40%, 
50%, 60%, 70%, 80%, or 90% of the native protein for its native substrate. 
Conservative substitution tables providing functionally similar amino 
acids are well known in the art. The following six groups each contain 
amino acids that are conservative substitutions for one another: 1) Alanine 
(A), Serine (S), Threonine (T); 2) Aspartic acid (D), Glutamic acid (E); 3) 
Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine 
(I), Leucine (L), Methionine (M), Valine (V); and 6) Phenylalanine (F), 
Tyrosine (Y), Tryptophan (W). See also Creighton, 1984, Proteins, W.H. 
Freeman and Company. 
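
The six substitution groups listed above can be expressed as a minimal 
Python sketch. The function names and the example peptides are 
hypothetical, and the sketch considers substitutions only (equal-length 
sequences), not the deletions or additions also mentioned above. 

```python
# The six conservative substitution groups recited above.
CONSERVATIVE_GROUPS = (
    frozenset("AST"),
    frozenset("DE"),
    frozenset("NQ"),
    frozenset("RK"),
    frozenset("ILMV"),
    frozenset("FYW"),
)
MAX_ALTERATIONS = 15  # upper bound of the 1-to-15 range recited above


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same substitution group."""
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


def count_substitutions(native: str, variant: str) -> tuple[int, int]:
    """Return (conservative, non-conservative) substitution counts."""
    if len(native) != len(variant):
        raise ValueError("this sketch handles substitutions only")
    conservative = other = 0
    for a, b in zip(native.upper(), variant.upper()):
        if a == b:
            continue
        if is_conservative(a, b):
            conservative += 1
        else:
            other += 1
    return conservative, other


def is_conservatively_modified(native: str, variant: str) -> bool:
    """True if 1 to 15 residues differ and every change is conservative."""
    conservative, other = count_substitutions(native, variant)
    return other == 0 and 1 <= conservative <= MAX_ALTERATIONS


# Hypothetical peptides: K->R, T->S, D->E, E->D are all within-group.
print(is_conservatively_modified("MKTLDE", "MRSLED"))
# E->W crosses groups, so this variant does not qualify.
print(is_conservatively_modified("MKTLDE", "MKTLDW"))
```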

As used herein, "immunogenic fragment" means the fragment of a 
polypeptide that is capable of eliciting an immunogenic response. 

Unless otherwise defined, all technical and scientific terms used 
herein have the same meaning as commonly understood by one of ordinary 
skill in the art to which this invention pertains. Although methods and 
materials similar or equivalent to those described herein can be used in 
the practice or testing of the present invention, the preferred methods and 
materials are described below. All publications, patent applications, 
patents, and other references mentioned herein are incorporated by 
reference in their entirety. In case of conflict, the present document, 
including definitions, will control. Unless otherwise indicated, materials, 
methods, and examples described herein are illustrative only and not 
intended to be limiting. 

Structure and Function 

The genes disclosed herein have homology to corresponding arrestin 
and odorant receptor Drosophila melanogaster genes. Fox, et al., 2001, 
PNAS 98:14693-14697. The genes disclosed herein have the utility disclosed 
within this patent application. 

A full-length Anopheles gambiae arrestin 1 cDNA has been cloned 
and sequenced. The arrestin 1 cDNA clone contains 1964 bp and includes 
a complete open reading frame that encodes a protein 383 amino acids in 
length, as seen in Figure 1. The open reading frame from the methionine 
includes 383 amino acids, yielding a slightly basic polypeptide (pI = 8.0) 
with a predicted molecular weight of 42.8 kD. 
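
As a rough consistency check on the figures above, a minimal Python sketch 
can estimate the mass of a 383-residue polypeptide and the minimum open 
reading frame length needed to encode it. The average residue mass of 
about 110 Da is a common rule of thumb and an assumption of this sketch, 
not a value taken from this disclosure; the function names are 
hypothetical. 

```python
# Rule-of-thumb average mass of one amino acid residue (assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(residue_count: int) -> float:
    """Approximate polypeptide mass in kilodaltons."""
    return residue_count * AVERAGE_RESIDUE_MASS_DA / 1000.0


def min_orf_length_bp(residue_count: int) -> int:
    """Coding length in base pairs: one codon per residue plus a stop."""
    return 3 * (residue_count + 1)


residues = 383           # length of arrestin 1 stated above
clone_length_bp = 1964   # length of the cDNA clone stated above

print(approx_mass_kda(residues))     # same order as the stated 42.8 kD
print(min_orf_length_bp(residues))   # fits within the 1964 bp clone
```

The estimate is approximate by construction; the predicted value stated 
above depends on the actual amino acid composition. 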

A full-length Anopheles gambiae odorant receptor 1 genomic DNA 
has been sequenced. The odorant receptor 1 genomic DNA contains 3895 
bp and includes a deduced open reading frame that encodes a protein 394 
amino acids in length. 

A full-length Anopheles gambiae odorant receptor 2 genomic DNA 
has been sequenced. The odorant receptor 2 genomic DNA contains 4985 
bp and includes a deduced open reading frame that encodes a protein 380 
amino acids in length. 

A full-length Anopheles gambiae odorant receptor 3 genomic DNA 
has been sequenced. The odorant receptor 3 genomic DNA contains 2083 
bp and includes a deduced open reading frame that encodes a protein 411 
amino acids in length. 

A full-length Anopheles gambiae odorant receptor 4 genomic DNA 
has been sequenced. The odorant receptor 4 genomic DNA contains 2374 
bp and includes a deduced open reading frame that encodes a protein 394 
amino acids in length. 

A full-length Anopheles gambiae odorant receptor 5 genomic DNA 
has been sequenced. The odorant receptor 5 genomic DNA contains 2272 
bp and includes a deduced open reading frame that encodes a protein 391 
amino acids in length. 

A partial Anopheles gambiae odorant receptor 6 genomic DNA has 
been sequenced. The odorant receptor 6 genomic DNA contains 931 bp and 
includes a deduced open reading frame that encodes a protein 157 amino 
acids in length. 

A full-length Anopheles gambiae odorant receptor 7 genomic DNA 
has been sequenced. The odorant receptor 7 genomic DNA contains 11,103 
bp and includes a deduced open reading frame that encodes a protein 401 
amino acids in length. 

Expression Control Sequences and Vectors 

The mosquito olfaction molecules of this invention can be used in a 
method to identify a mosquito olfaction molecule binding compound. If 
desired, the mosquito olfaction molecule binding compounds may be 
further tested for ability to inhibit binding of arrestin to an odorant 
receptor. Methods for this test are described herein. In certain 
embodiments, the DNA that encodes the arrestin 1 polypeptide ("ARR1 
DNA") may be cloned into an expression vector, i.e., a vector wherein 
ARR1 DNA is operably linked to expression control sequences. The need 
for expression control sequences will vary according to the type of cell in 
which the ARR1 DNA is to be expressed. Generally, expression control 
sequences include a transcriptional promoter, enhancer, suitable mRNA 
ribosomal binding sites, and sequences that terminate transcription and 
translation. One of ordinary skill in the art can select proper expression 
control sequences. Standard methods can be used by one skilled in the art 
to construct expression vectors. See generally, Sambrook et al., 1989, 
Molecular Cloning: A Laboratory Manual (2nd Edition), Cold Spring 
Harbor Press, Cold Spring Harbor, N.Y. Vectors useful in this invention 
include, but are not limited to, plasmid vectors and viral vectors. 

All other nucleic acid sequences disclosed herein may also be 
operably linked to expression control sequences. The expression control 
sequences described above may be used. As mentioned above, methods 
known to those of ordinary skill in the art may be used to link nucleic 
acid sequences to expression control sequences. Methods known to those 
of ordinary skill in the art may be used to introduce the nucleic acid and 
expression control sequence into eukaryotic and/or prokaryotic cells. An 
example of prokaryotic cells is BL21 (DE3)pLysS bacteria. An example of 
eukaryotic cells is Sf9. 

In certain embodiments of the invention, ARR1 DNA is introduced 
into, and expressed in, a prokaryotic cell, e.g., BL21 (DE3)pLysS bacteria. 

In certain embodiments of the invention, the ARR1 DNA is 
introduced into, and expressed in, a eukaryotic cell in vitro. Eukaryotic 
cells useful for expressing ARR1 DNA in vitro include, but are not limited 
to Sf9 cells. Transfection of the eukaryotic cell can be transient or stable. 

Mosquito Olfaction Molecule-Specific Antibody 

An animal is immunized with a mosquito olfaction molecule (e.g., 
arrestin 1 polypeptide). The animal produces antibodies to the mosquito 
olfaction molecule. The production and collection of the polyclonal 
antibodies was performed by Lampire Biological Laboratories, Inc. of 
Pipersville, PA 18947, using techniques known in the art. 

Mosquito Olfaction Molecule Antibody Label 

In some embodiments of the invention, the mosquito olfaction 
molecule-specific antibody includes a detectable label. Many detectable 
labels can be linked to, or incorporated into, an antibody of this invention. 
The following are examples of useful labels: radioactive, non-radioactive 
isotopic, fluorescent, chemiluminescent, paramagnetic, enzyme, or 
colorimetric. 

Examples of useful enzyme labels include malate dehydrogenase, 
staphylococcal nuclease, delta-5-steroid isomerase, alcohol 
dehydrogenase, alpha-glycerol phosphate dehydrogenase, triose phosphate 
isomerase, peroxidase, alkaline phosphatase, asparaginase, glucose 
oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6- 
phosphate dehydrogenase, glucoamylase, and acetylcholinesterase. 
Examples of useful radioisotopic labels include ³H, ¹³¹I, ¹²⁵I, ³²P, ³⁵S, and 
¹⁴C. Examples of useful fluorescent labels include fluorescein, rhodamine, 
phycoerythrin, phycocyanin, allophycocyanin, and fluorescamine. 
Examples of useful chemiluminescent label types include luminol, 
isoluminol, aromatic acridinium ester, imidazole, acridinium salt, oxalate 
ester, luciferin, luciferase, and aequorin. 

Antibody labels can be coupled to, or incorporated into, antibodies by 
use of common techniques known to those of ordinary skill in the art. 
Typical techniques are described by Kennedy et al., 1976, Clin. Chim. 
Acta, 70:1-31; and Schuurs et al., 1977, Clin. Chim. Acta, 81:1-40. Useful 
chemical coupling methods include those that use glutaraldehyde, 
periodate, dimaleimide and m-maleimido-benzyl-N-hydroxy-succinimide 
ester. 

Screening assays 

The present invention provides, in part, a screen for mosquito 
olfaction molecule binding compounds with the ability to interrupt the 
interaction of arrestin with an odorant receptor. Identifying that a test 
agent will bind a mosquito olfaction molecule is one part. Once a test 
agent has demonstrated its ability to bind a mosquito olfaction molecule, it 
is properly called a mosquito olfaction molecule binding compound. Since 
it is possible for a mosquito olfaction molecule binding compound to bind 
without necessarily interrupting the arrestin-odorant receptor interaction, 
it is proper to further assay in order to determine that the interaction is 
disrupted. The ability of the mosquito olfaction molecule binding 
compound to interrupt the arrestin-odorant receptor interaction may be 
assayed. 

In certain embodiments, a test agent is identified as a mosquito 
olfaction molecule binding compound by the following method. One of the 
mosquito olfaction molecules is immobilized (e.g., arrestin 1). Polypeptides 
can be immobilized using methods known in the art. Such methods include 
the use of Affigel (Biorad) or activated agarose or sepharose to which 
significant amounts of polypeptides can be directly coupled. The 
immobilized polypeptide (e.g., arrestin 1) is contacted with the test agent. 
Unbound test agent can be removed by washing with binding buffer. Then, 
the bound test agent is eluted by a salt gradient. The material that is 
bound to the immobilized polypeptide may be purified by SDS-PAGE. 
Other methods known by one of ordinary skill in the art for identifying an 
interaction between two proteins include affinity purification, 
co-immunoprecipitation, and far-western blotting. 

In certain embodiments, the following method is used to screen for 
substances capable of interrupting arrestin-odorant receptor interaction. 
The following method of detecting protein-protein interaction will also 
provide information regarding the lack of protein-protein interactions. The 
two-hybrid method is a well known genetic assay used to detect 
protein-protein interactions in vivo. See, e.g., Bartel et al., 1993, In 
Cellular Interactions in Development: A Practical Approach, Oxford 
University Press, Oxford, pp. 153-179; Chien et al., 1991, Proc. Natl. 
Acad. Sci. USA, 88:9578-9582; Fields et al., 1989, Nature, 340:245-247; 
Fritz et al., 1992, Curr. Biol., 2:403-405; Guarente, L., 1993, Proc. Natl. 
Acad. Sci. USA, 90:1639-1641. There are multiple combinations available 
between arrestin and the seven odorant receptors. A GAL4 binding domain 
is linked to an arrestin fragment (e.g., arrestin 1 polypeptide) and a GAL4 
transactivation domain is linked to an odorant receptor fragment (e.g., 
odorant receptor 1 polypeptide). A GAL4 binding site is linked to a 
reporter gene such as lacZ. All three elements are contacted in the 
presence and absence of a mosquito olfaction molecule binding compound. 
The level of expression of the reporter gene is monitored. A decrease in the 
level of expression of lacZ means that the mosquito olfaction molecule 
binding compound interrupts the interaction of arrestin with the odorant 
receptor. 
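
The readout logic of the two-hybrid screen described above can be 
illustrated with a minimal Python sketch. The function names, the 
`min_fractional_decrease` parameter, and the example reporter values are 
hypothetical; the disclosure states only that a decrease in lacZ 
expression indicates interruption, so any non-zero cutoff is an assumption 
a practitioner would choose. 

```python
def fractional_decrease(reporter_without: float, reporter_with: float) -> float:
    """Fractional drop in reporter expression when the compound is present."""
    if reporter_without <= 0:
        raise ValueError("baseline reporter expression must be positive")
    return (reporter_without - reporter_with) / reporter_without


def interrupts_interaction(
    reporter_without: float,
    reporter_with: float,
    min_fractional_decrease: float = 0.0,
) -> bool:
    """True if reporter expression falls by more than the chosen cutoff.

    The default cutoff of 0.0 mirrors the text (any decrease counts);
    a larger cutoff is an assumption, not part of the disclosure.
    """
    drop = fractional_decrease(reporter_without, reporter_with)
    return drop > min_fractional_decrease


# Hypothetical lacZ readouts (arbitrary units) without and with compound.
print(fractional_decrease(200.0, 50.0))
print(interrupts_interaction(200.0, 50.0))
print(interrupts_interaction(200.0, 210.0))
```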

In an alternate embodiment, the following is a method that will 
identify whether a mosquito olfaction molecule binding compound will 
interrupt the interaction between arrestin and an odorant receptor. The 
following method of co-immunoprecipitation may make use of the 
available panel of antibodies to any arrestin or odorant receptor. Since this 
method makes use of antibodies that demonstrate the ability to 
immunoprecipitate the mosquito olfaction molecule and other proteins to 
which it is bound, the ability of a mosquito olfaction molecule binding 
compound to inhibit the interaction of the mosquito olfaction molecule will 
serve as the measure of the compound's interruption ability. 

Also disclosed herein is a method of modulating arrestin 1 biological 
activity. In certain embodiments, the method comprises administering an 
arrestin 1 biological activity-modulating amount of a mosquito olfaction 
molecule binding compound. Upon administration, arrestin 1 is contacted 
with the mosquito olfaction molecule binding compound. Such contact 
results in modulating arrestin 1 biological activity. The mosquito olfaction 
molecule binding compound may be administered as an aerosol, solid, or 
liquid, such that delivery occurs through contact with the body of the 
target subject. For example, administration may occur by absorption 
through the exterior surfaces of the target subject, i.e., mosquitoes, or by 
intake through other apertures of the target subject [proboscis (or other 
feeding aperture), or spiracles (or other respiratory apertures)]. An 
activity-modulating amount of mosquito olfaction molecule binding 
compound is an amount that is sufficient to prohibit at least about 50% of 
the arrestin 1 (SEQ ID NO: 2) molecules from interacting with any 
odorant receptors. 

All citations and references described in this patent application are 
hereby incorporated herein by reference, in their entirety. Also 
incorporated in this specification are the exhibits filed herewith. The 
present invention is further illustrated by the following specific examples. 
The examples are provided for illustration only and are not to be 
construed as limiting the scope or content of the invention in any way. 

Example 1 
Protein expression 

A cDNA encoding arrestin 1 is subcloned into the pBlueScript II 
(KS) vector (Novagen, Madison, WI) at the BamHI/NdeI restriction sites 
for DNA sequencing. The cDNA encoding arrestin 1 is subsequently 
subcloned into the bacterial expression plasmid pET15b (Novagen, 
Madison, WI). The bacterial expression plasmid containing the arrestin 1 
cDNA is transformed into BL21 (DE3)pLysS bacteria (Novagen, Madison, 
WI) for high levels of arrestin 1 expression. Methods are known in the art 
for isolating the expressed protein. 

Expression of other nucleic acids disclosed herein is achieved by 
using the above-referenced method. Once the odorant receptor is in 
protein form, it may be used as described within this application. 

Example 2

Mosquito Olfaction Molecule Specific Antibody

The cDNA encoding arrestin 1 is subcloned into the bacterial
expression plasmid pET15b (Novagen, Madison, WI). The vector is
transformed into BL21(DE3)pLysS bacteria (Novagen, Madison, WI) for
high levels of arrestin 1 expression. Rapid purification is performed using
His-Bind affinity resin (Novagen, Madison, WI). Native recombinant
arrestin 1 is then denatured and gel-purified by SDS-polyacrylamide
gel electrophoresis, followed by staining with 0.05% Coomassie Brilliant
Blue (Sigma-Aldrich, St. Louis, MO). Polyclonal antibodies were
generated in rabbits by Lampire Biological Laboratories, Inc. of
Pipersville, PA 18947. Polyclonal antibodies may be generated for any of
the odorant receptors disclosed herein.

Example 3

Identification of a mosquito olfaction molecule binding compound

Arrestin 1 polypeptide is expressed in and purified from
BL21(DE3)pLysS bacteria (Novagen, Madison, WI). Arrestin 1 is incubated
with a test agent in Phosphate Buffered Saline (pH 7.5), 0.1% Tween-20,
and 0.1% broad spectrum protease inhibitors for 90 minutes at 4°C.
Anti-arrestin 1 polyclonal serum is added to the reaction at a dilution of
1:2000 and incubated for an additional 60 minutes. The complexes,
consisting of either polypeptide-antibody or test agent-polypeptide-antibody,
are isolated by the addition of 1 x 10^7 Dynabeads M280 (sheep anti-rabbit
IgG) followed by incubation at the same temperature for an additional 60
minutes. Isolation of the complexes is completed by using the DYNAL
Magnetic Particle Concentrator (Dynal Inc., Lake Success, NY). The
complexes are washed three times with broad spectrum protease
inhibitors. Content of the complexes is assayed by SDS-PAGE followed by
silver staining and western blotting. Common methods are known by
those of ordinary skill in the art for silver staining and western blotting.
See generally, Sambrook et al., 2001, Molecular Cloning: A Laboratory
Manual (3rd Edition), Cold Spring Harbor Press, Cold Spring Harbor, N.Y.
The presence of the test agent, polypeptide, and antibody in the isolated
complex indicates that the test agent binds to the polypeptide.
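
The readout of this assay can be summarized in a short Python sketch. It assumes, hypothetically, that the species detected in the isolated complex have already been scored as present or absent from the stained gel and blot; the labels and the function name are illustrative only and are not taken from the disclosure.

```python
# Species that must be recovered for the pull-down itself to be interpretable.
REQUIRED = {"antibody", "polypeptide"}


def interpret_pulldown(detected: set[str]) -> str:
    """Classify one isolated complex from the set of species detected in it."""
    if not REQUIRED <= detected:
        # Without the antibody-polypeptide complex, nothing can be concluded.
        return "invalid: antibody-polypeptide complex not recovered"
    if "test agent" in detected:
        return "test agent binds polypeptide"
    return "no binding detected"
```

For example, `interpret_pulldown({"antibody", "polypeptide", "test agent"})` reports binding, while a complex containing only antibody and polypeptide is reported as showing no detectable binding.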

Example 4

Identification of a compound that inhibits binding of arrestin to
an odorant receptor

Arrestin 1 polypeptide and odorant receptor 1 polypeptide are
expressed in and purified from BL21(DE3)pLysS bacteria (Novagen,
Madison, WI). Arrestin 1 polypeptide and odorant receptor 1 polypeptide
are incubated with a mosquito olfaction molecule binding compound in
Phosphate Buffered Saline (pH 7.5), 0.1% Tween-20, and 0.1% broad
spectrum protease inhibitors for 90 minutes at 4°C. Anti-arrestin 1
polyclonal serum is added to the reaction at a dilution of 1:2000 and
incubated for an additional 60 minutes. The complexes, consisting of
either antibody-arrestin 1-odorant receptor 1 or antibody-arrestin 1, are
isolated by the addition of 1 x 10^7 Dynabeads M280 (sheep anti-rabbit
IgG; Dynal Inc., Lake Success, NY) followed by incubation at the same
temperature for an additional 60 minutes. Once the isolation of the
complexes is completed by using the DYNAL Magnetic Particle
Concentrator (Dynal Inc., Lake Success, NY), the complexes are washed
three times with broad spectrum protease inhibitors. The content of the
complexes is assayed by SDS-PAGE followed by silver staining and
western blotting. Common methods are known by those of ordinary skill in
the art for silver staining and western blotting. See generally, Sambrook et
al., 2001, Molecular Cloning: A Laboratory Manual (3rd Edition), Cold
Spring Harbor Press, Cold Spring Harbor, N.Y.

Example 5

Far western blotting to analyze components of a protein mixture

The protein sample is fractionated on an SDS-PAGE gel. After
electrophoresis at a voltage and time that are known in the art, the proteins
are transferred from the gel onto a solid support membrane by
electroblotting. Transferred membranes may be stained with Ponceau S to
facilitate location and identification of specific proteins. Nonspecific sites
on the membranes are blocked with standard blocking reagents, and the
membranes are then incubated with a radiolabeled non-antibody protein
probe. After washing, proteins that bind to the probe are detected by
autoradiography.

The contents of the solutions used within this protocol are disclosed
in Wiley's Current Protocols in Cell Biology.

The protein sample to be analyzed is resuspended in 1x SDS sample
buffer. Approximately 50 to 100 µg can be loaded in each lane of the gel.
The samples are separated with SDS-PAGE. The proteins are transferred
to nitrocellulose by electroblotting.
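
The 50 to 100 µg loading range can be converted into a pipetting volume once the sample concentration is known. The Python sketch below shows that arithmetic; the function names and the example concentration are hypothetical, and the sample concentration itself is assumed to have been measured separately.

```python
def loading_volume_ul(target_ug: float, conc_ug_per_ul: float) -> float:
    """Volume (in µl) of sample that contains `target_ug` of protein."""
    if target_ug <= 0 or conc_ug_per_ul <= 0:
        raise ValueError("mass and concentration must be positive")
    return target_ug / conc_ug_per_ul


def loading_range_ul(conc_ug_per_ul: float,
                     low_ug: float = 50.0,
                     high_ug: float = 100.0) -> tuple[float, float]:
    """Volumes (in µl) that deliver the low and high ends of the loading range."""
    return (loading_volume_ul(low_ug, conc_ug_per_ul),
            loading_volume_ul(high_ug, conc_ug_per_ul))
```

For a sample at an assumed 2 µg/µl, `loading_range_ul(2.0)` gives 25 to 50 µl per lane, which can then be checked against the well capacity of the gel in use.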

After transfer, the membrane is stained for 5 min in ~100 ml freshly
diluted 1x Ponceau S staining solution. The membrane is then destained
by washing it in several changes of deionized water until the proteins are
clearly visible. Destaining is continued for an additional 5 min in water
until the red staining fades.

The membrane is then blocked for 2 hr in 200 ml blocking buffer I
at room temperature with gentle agitation. The membrane is incubated in
200 ml of blocking buffer II for 2 hours and rinsed briefly in 100
ml of 1x PBS.

Prior to probing, the membrane is preincubated for 10 min in 50 ml
of 1x probe dilution buffer without the probe at room temperature. The
probe is added to the membrane and incubated for 2 hours at room
temperature. The membrane is washed with 200 ml 1x PBS for 5 min at
room temperature. The wash step is repeated three additional times. The
filter is air dried and exposed to x-ray film with an intensifying screen.
An overnight exposure is typically sufficient.
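
As a planning aid, the timed steps stated above can be added up. The Python sketch below does so using only the durations given in the text; the step labels are informal, and untimed steps (electrophoresis, transfer, the initial destain washes, drying, and the overnight film exposure) are deliberately left out.

```python
# (step label, duration in minutes), in the order given in the protocol.
TIMED_STEPS = [
    ("Ponceau S stain", 5),
    ("additional destain in water", 5),
    ("blocking buffer I", 120),
    ("blocking buffer II", 120),
    ("preincubation in probe dilution buffer", 10),
    ("probe incubation", 120),
    ("PBS washes (one wash plus three repeats, 5 min each)", 4 * 5),
]


def total_minutes(steps: list[tuple[str, int]]) -> int:
    """Sum the durations of all timed steps."""
    return sum(minutes for _, minutes in steps)


def as_hours_minutes(minutes: int) -> tuple[int, int]:
    """Convert minutes to (hours, remaining minutes)."""
    return divmod(minutes, 60)
```

With these figures the timed steps come to 400 minutes, or 6 hours 40 minutes, before the overnight exposure.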

The present invention is not limited by mechanism or theory.
Although there have been described general and specific embodiments of
the invention herein, these embodiments do not limit the scope of the
invention except as set forth in the claims below.

CLAIMS

What is claimed is:

1. A method of identifying an agent that binds to mosquito olfaction 
molecules, comprising:

a) providing an isolated mosquito olfaction molecule; 

b) contacting a test agent with the isolated mosquito olfaction 
molecule; and 

c) detecting specific binding of the test agent to the isolated 
mosquito olfaction molecule,

wherein the presence of specific binding identifies the test agent as a 
mosquito olfaction molecule binding compound. 

2. The method of claim 1, wherein the isolated mosquito olfaction
molecule further comprises a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ
ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, and SEQ ID NO: 20.

3. The method of claim 1, wherein contacting the test agent with the
isolated mosquito olfaction molecule further comprises contacting under 
native conditions. 

4. The method of claim 1, wherein detecting specific binding of the test
agent to the isolated mosquito olfaction molecule further comprises
immunoprecipitation.

5. The method of claim 4, wherein the isolated mosquito olfaction
molecule comprises a polypeptide selected from a group consisting of: SEQ
ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 14,
SEQ ID NO: 16, SEQ ID NO: 18, and SEQ ID NO: 20.

6. The method of claim 4, wherein the isolated mosquito olfaction molecule
comprises a polypeptide selected from a group consisting of: conservatively
modified SEQ ID NO: 2, conservatively modified SEQ ID NO: 4,
conservatively modified SEQ ID NO: 6, conservatively modified SEQ ID
NO: 8, conservatively modified SEQ ID NO: 14, conservatively modified
SEQ ID NO: 16, conservatively modified SEQ ID NO: 18, and
conservatively modified SEQ ID NO: 20.

7. A method of identifying a compound that inhibits binding of a 
mosquito arrestin to a mosquito odorant receptor, comprising:

providing an antibody that binds to an isolated mosquito olfaction 
molecule; 

providing a mosquito olfaction molecule binding compound; 
providing a test sample; 

combining the mosquito olfaction molecule binding compound, the
antibody, and the test sample in reaction conditions that allow a complex
to form in the absence of the mosquito olfaction molecule binding
compound, wherein the complex includes the mosquito arrestin and the
mosquito odorant receptor; and

determining whether the mosquito olfaction molecule binding
compound decreases the formation of the complex, wherein a decrease
indicates that the mosquito olfaction molecule binding compound is a
compound that inhibits the binding of mosquito arrestin to mosquito
odorant receptor.

8. The method of claim 7, wherein 2-hybrid analysis is used to identify 
a compound that inhibits the binding of mosquito arrestin to a mosquito 
odorant receptor. 

9. The method of claim 8, wherein a GAL4 binding domain is linked to an
arrestin fragment.

10. The method of claim 9, wherein a GAL4 transactivation domain is
linked to an odorant receptor fragment.

11. The method of claim 7, wherein co-immunoprecipitation is used to
determine whether the mosquito olfaction molecule binding compound
decreases the formation of the complex.

12. The method of claim 11, wherein the antibody binds to a
polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO: 2 and conservatively modified SEQ ID NO: 2.

13. An isolated polynucleotide comprising a sequence selected from the 
group consisting of: 

a nucleotide sequence encoding a polypeptide comprising an amino 
acid sequence of SEQ ID NO: 2;

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 2; 

a nucleotide sequence encoding a polypeptide comprising a 
conservatively modified amino acid sequence of SEQ ID NO: 2; and 
a nucleotide sequence that hybridizes under stringent conditions to a
hybridization probe the nucleotide sequence of which consists of SEQ ID 
NO: 1, or the complement of SEQ ID NO: 1. 

14. The isolated polynucleotide of claim 13, comprising a nucleotide
sequence encoding a polypeptide comprising an amino acid sequence of
SEQ ID NO: 2.

15. The isolated polynucleotide of claim 13, comprising a nucleotide
sequence encoding a polypeptide comprising at least 20 consecutive
residues of the amino acid sequence of SEQ ID NO: 2.

16. The isolated polynucleotide of claim 13, comprising a nucleotide
sequence encoding a polypeptide comprising a conservatively modified
amino acid sequence of SEQ ID NO: 2.

17. The isolated polynucleotide of claim 13, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization 
probe the nucleotide sequence of which consists of SEQ ID NO: 1, or the 
complement of SEQ ID NO: 1. 

18. A purified polypeptide comprising a sequence selected from the
group consisting of: 

an amino acid sequence of SEQ ID NO: 2; 

an amino acid sequence of conservatively modified SEQ ID NO: 2; 
and 

an amino acid sequence of SEQ ID NO: 2, having at least 20
consecutive residues.

19. The purified polypeptide of claim 18, comprising an amino acid 
sequence of SEQ ID NO: 2. 

20. The purified polypeptide of claim 18, comprising an amino acid 
sequence of conservatively modified SEQ ID NO: 2. 

21. The purified polypeptide of claim 18, comprising an amino acid 
sequence of SEQ ID NO: 2, having at least 20 consecutive residues.

22. An isolated polynucleotide comprising a sequence selected from the 
group consisting of: 

a nucleotide sequence encoding a polypeptide comprising an amino 
acid sequence of SEQ ID NO: 4;

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 4;

a nucleotide sequence encoding a polypeptide comprising a
conservatively modified amino acid sequence of SEQ ID NO: 4; and 

a nucleotide sequence that hybridizes under stringent conditions to 
a hybridization probe the nucleotide sequence of which consists of SEQ ID 
NO: 3, or the complement of SEQ ID NO: 3.

23. The isolated polynucleotide of claim 22, comprising a nucleotide 
sequence encoding a polypeptide comprising an amino acid sequence of 
SEQ ID NO: 4. 

24. The isolated polynucleotide of claim 22, comprising a nucleotide 
sequence encoding a polypeptide comprising at least 20 consecutive 
residues of the amino acid sequence of SEQ ID NO: 4. 

25. The isolated polynucleotide of claim 22, comprising a nucleotide
sequence encoding a polypeptide comprising a conservatively modified 
amino acid sequence of SEQ ID NO: 4. 

26. The isolated polynucleotide of claim 22, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization
probe the nucleotide sequence of which consists of SEQ ID NO: 3, or the
complement of SEQ ID NO: 3.

27. A purified polypeptide comprising a sequence selected from the 
group consisting of:

an amino acid sequence of SEQ ID NO: 4; 

an amino acid sequence of conservatively modified SEQ ID NO: 4; 
and 

an amino acid sequence of SEQ ID NO: 4, having at least 20 
consecutive residues.

28. The purified polypeptide of claim 27, comprising an amino acid
sequence of SEQ ID NO: 4. 

29. The purified polypeptide of claim 27, comprising an amino acid 
sequence of conservatively modified SEQ ID NO: 4.

30. The purified polypeptide of claim 27, comprising an amino acid 
sequence of SEQ ID NO: 4, having at least 20 consecutive residues. 

31. An isolated polynucleotide comprising a sequence selected from the
group consisting of: 

a nucleotide sequence encoding a polypeptide comprising an amino 
acid sequence of SEQ ID NO: 6; 

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 6;

a nucleotide sequence encoding a polypeptide comprising a 
conservatively modified amino acid sequence of SEQ ID NO: 6; and 

a nucleotide sequence that hybridizes under stringent conditions to 
a hybridization probe the nucleotide sequence of which consists of SEQ ID 
NO: 5, or the complement of SEQ ID NO: 5.

32. The isolated polynucleotide of claim 31, comprising a nucleotide 
sequence encoding a polypeptide comprising an amino acid sequence of 
SEQ ID NO: 6. 

33. The isolated polynucleotide of claim 31, comprising a nucleotide 
sequence encoding a polypeptide comprising at least 20 consecutive 
residues of the amino acid sequence of SEQ ID NO: 6. 

34. The isolated polynucleotide of claim 31, comprising a nucleotide
sequence encoding a polypeptide comprising a conservatively modified
amino acid sequence of SEQ ID NO: 6.

35. The isolated polynucleotide of claim 31, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization 
probe the nucleotide sequence of which consists of SEQ ID NO: 5, or the 
complement of SEQ ID NO: 5. 

36. A purified polypeptide comprising a sequence selected from the 
group consisting of: 

an amino acid sequence of SEQ ID NO: 6; 

an amino acid sequence of conservatively modified SEQ ID NO: 6; 
and

an amino acid sequence of SEQ ID NO: 6, having at least 20
consecutive residues. 

37. The purified polypeptide of claim 36, comprising an amino acid 
sequence of SEQ ID NO: 6.

38. The purified polypeptide of claim 36, comprising an amino acid 
sequence of conservatively modified SEQ ID NO: 6. 

39. The purified polypeptide of claim 36, comprising an amino acid
sequence of SEQ ID NO: 6, having at least 20 consecutive residues. 

40. An isolated polynucleotide comprising a sequence selected from the 
group consisting of: 

a nucleotide sequence encoding a polypeptide comprising an amino
acid sequence of SEQ ID NO: 8;

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 8; 

a nucleotide sequence encoding a polypeptide comprising a 
conservatively modified amino acid sequence of SEQ ID NO: 8; and

a nucleotide sequence that hybridizes under stringent conditions to
a hybridization probe the nucleotide sequence of which consists of SEQ ID
NO: 7, or the complement of SEQ ID NO: 7.

41. The isolated polynucleotide of claim 40, comprising a nucleotide
sequence encoding a polypeptide comprising an amino acid sequence of
SEQ ID NO: 8.

42. The isolated polynucleotide of claim 40, comprising a nucleotide 
sequence encoding a polypeptide comprising at least 20 consecutive 
residues of the amino acid sequence of SEQ ID NO: 8. 

43. The isolated polynucleotide of claim 40, comprising a nucleotide 
sequence encoding a polypeptide comprising a conservatively modified 
amino acid sequence of SEQ ID NO: 8. 

44. The isolated polynucleotide of claim 40, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization 
probe the nucleotide sequence of which consists of SEQ ID NO: 7, or the 
complement of SEQ ID NO: 7. 

45. A purified polypeptide comprising a sequence selected from the
group consisting of: 

an amino acid sequence of SEQ ID NO: 8; 

an amino acid sequence of conservatively modified SEQ ID NO: 8; 
and 

an amino acid sequence of SEQ ID NO: 8, having at least 20
consecutive residues.

46. The purified polypeptide of claim 45, comprising an amino acid 
sequence of SEQ ID NO: 8. 

47. The purified polypeptide of claim 45, comprising an amino acid
sequence of conservatively modified SEQ ID NO: 8.

48. The purified polypeptide of claim 45, comprising an amino acid
sequence of SEQ ID NO: 8, having at least 20 consecutive residues. 

49. An isolated polynucleotide comprising a sequence selected from the 
group consisting of:

a nucleotide sequence encoding a polypeptide comprising an amino 
acid sequence of SEQ ID NO: 14; 

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 14; 
a nucleotide sequence encoding a polypeptide comprising a
conservatively modified amino acid sequence of SEQ ID NO: 14; and

a nucleotide sequence that hybridizes under stringent conditions to 
a hybridization probe the nucleotide sequence of which consists of SEQ ID 
NO: 13, or the complement of SEQ ID NO: 13. 

50. The isolated polynucleotide of claim 49, comprising a nucleotide 
sequence encoding a polypeptide comprising an amino acid sequence of 
SEQ ID NO: 14. 

51. The isolated polynucleotide of claim 49, comprising a nucleotide
sequence encoding a polypeptide comprising at least 20 consecutive 
residues of the amino acid sequence of SEQ ID NO: 14. 

52. The isolated polynucleotide of claim 49, comprising a nucleotide
sequence encoding a polypeptide comprising a conservatively modified
amino acid sequence of SEQ ID NO: 14.

53. The isolated polynucleotide of claim 49, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization
probe the nucleotide sequence of which consists of SEQ ID NO: 13, or the
complement of SEQ ID NO: 13.

54. A purified polypeptide comprising a sequence selected from the
group consisting of: 

an amino acid sequence of SEQ ID NO: 14; 

an amino acid sequence of conservatively modified SEQ ID NO: 14; 
and

an amino acid sequence of SEQ ID NO: 14, having at least 20 
consecutive residues. 

55. The purified polypeptide of claim 54, comprising an amino acid 
sequence of SEQ ID NO: 14.

56. The purified polypeptide of claim 54, comprising an amino acid 
sequence of conservatively modified SEQ ID NO: 14. 

57. The purified polypeptide of claim 54, comprising an amino acid
sequence of SEQ ID NO: 14, having at least 20 consecutive residues. 

58. An isolated polynucleotide comprising a sequence selected from the 
group consisting of: 

a nucleotide sequence encoding a polypeptide comprising an amino
acid sequence of SEQ ID NO: 16;

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 16; 

a nucleotide sequence encoding a polypeptide comprising a 
conservatively modified amino acid sequence of SEQ ID NO: 16; and

a nucleotide sequence that hybridizes under stringent conditions to 
a hybridization probe the nucleotide sequence of which consists of SEQ ID 
NO: 15, or the complement of SEQ ID NO: 15. 

59. The isolated polynucleotide of claim 58, comprising a nucleotide
sequence encoding a polypeptide comprising an amino acid sequence of
SEQ ID NO: 16.

60. The isolated polynucleotide of claim 58, comprising a nucleotide
sequence encoding a polypeptide comprising at least 20 consecutive 
residues of the amino acid sequence of SEQ ID NO: 16. 

61. The isolated polynucleotide of claim 58, comprising a nucleotide
sequence encoding a polypeptide comprising a conservatively modified 
amino acid sequence of SEQ ID NO: 16. 

62. The isolated polynucleotide of claim 58, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization
probe the nucleotide sequence of which consists of SEQ ID NO: 15, or the
complement of SEQ ID NO: 15.

63. A purified polypeptide comprising a sequence selected from the 
group consisting of:

an amino acid sequence of SEQ ID NO: 16; 

an amino acid sequence of conservatively modified SEQ ID NO: 16; 
and 

an amino acid sequence of SEQ ID NO: 16, having at least 20 
consecutive residues.

64. The purified polypeptide of claim 63, comprising an amino acid 
sequence of SEQ ID NO: 16. 

65. The purified polypeptide of claim 63, comprising an amino acid
sequence of conservatively modified SEQ ID NO: 16. 

66. The purified polypeptide of claim 63, comprising an amino acid 
sequence of SEQ ID NO: 16, having at least 20 consecutive residues. 

67. An isolated polynucleotide comprising a sequence selected from the
group consisting of:

a nucleotide sequence encoding a polypeptide comprising an amino
acid sequence of SEQ ID NO: 18; 

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 18; 
a nucleotide sequence encoding a polypeptide comprising a
conservatively modified amino acid sequence of SEQ ID NO: 18; and

a nucleotide sequence that hybridizes under stringent conditions to 
a hybridization probe the nucleotide sequence of which consists of SEQ ID 
NO: 17, or the complement of SEQ ID NO: 17. 

68. The isolated polynucleotide of claim 67, comprising a nucleotide 
sequence encoding a polypeptide comprising an amino acid sequence of 
SEQ ID NO: 18. 

69. The isolated polynucleotide of claim 67, comprising a nucleotide
sequence encoding a polypeptide comprising at least 20 consecutive 
residues of the amino acid sequence of SEQ ID NO: 18. 

70. The isolated polynucleotide of claim 67, comprising a nucleotide
sequence encoding a polypeptide comprising a conservatively modified
amino acid sequence of SEQ ID NO: 18.

71. The isolated polynucleotide of claim 67, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization
probe the nucleotide sequence of which consists of SEQ ID NO: 17, or the
complement of SEQ ID NO: 17.

72. A purified polypeptide comprising a sequence selected from the 
group consisting of: 

an amino acid sequence of SEQ ID NO: 18;

an amino acid sequence of conservatively modified SEQ ID NO: 18;
and

an amino acid sequence of SEQ ID NO: 18, having at least 20
consecutive residues. 

73. The purified polypeptide of claim 72, comprising an amino acid 
sequence of SEQ ID NO: 18.

74. The purified polypeptide of claim 72, comprising an amino acid 
sequence of conservatively modified SEQ ID NO: 18. 

75. The purified polypeptide of claim 72, comprising an amino acid
sequence of SEQ ID NO: 18, having at least 20 consecutive residues. 

76. An isolated polynucleotide comprising a sequence selected from the 
group consisting of: 

a nucleotide sequence encoding a polypeptide comprising an amino
acid sequence of SEQ ID NO: 20;

a nucleotide sequence encoding a polypeptide comprising at least 20 
consecutive residues of the amino acid sequence of SEQ ID NO: 20; 

a nucleotide sequence encoding a polypeptide comprising a 
conservatively modified amino acid sequence of SEQ ID NO: 20; and

a nucleotide sequence that hybridizes under stringent conditions to 
a hybridization probe the nucleotide sequence of which consists of SEQ ID 
NO: 19, or the complement of SEQ ID NO: 19. 

77. The isolated polynucleotide of claim 76, comprising a nucleotide
sequence encoding a polypeptide comprising an amino acid sequence of 
SEQ ID NO: 20. 

78. The isolated polynucleotide of claim 76, comprising a nucleotide 
sequence encoding a polypeptide comprising at least 20 consecutive
residues of the amino acid sequence of SEQ ID NO: 20.

79. The isolated polynucleotide of claim 76, comprising a nucleotide
sequence encoding a polypeptide comprising a conservatively modified 
amino acid sequence of SEQ ID NO: 20. 

80. The isolated polynucleotide of claim 76, comprising a nucleotide
sequence that hybridizes under stringent conditions to a hybridization 
probe the nucleotide sequence of which consists of SEQ ID NO: 19, or the 
complement of SEQ ID NO: 19. 

81. A purified polypeptide comprising a sequence selected from the
group consisting of: 

an amino acid sequence of SEQ ID NO: 20; 
an amino acid sequence of conservatively modified SEQ ID NO: 20;
and 

an amino acid sequence of SEQ ID NO: 20, having at least 20
consecutive residues.

82. The purified polypeptide of claim 81, comprising an amino acid 
sequence of SEQ ID NO: 20. 

83. The purified polypeptide of claim 81, comprising an amino acid 
sequence of conservatively modified SEQ ID NO: 20. 

84. The purified polypeptide of claim 81, comprising an amino acid 
sequence of SEQ ID NO: 20, having at least 20 consecutive residues.

85. A method of modulating arrestin 1 biological activity, the method 
comprising: 

administering an arrestin 1 biological activity-modulating amount
of a mosquito olfaction molecule binding compound;

contacting the arrestin 1 with the mosquito olfaction molecule
binding compound; and

modulating arrestin 1 biological activity through the arrestin 1
contact with the mosquito olfaction molecule binding compound.



Figure 1



Anopheles gambiae arrestin 1 cDNA sequence (SEQ ID NO: 1) 

ACAGGAACGACGGTTGTGATCCCTCCACTGGTGGTGACACGAATCATAAGCATT
ATTTCATACCTAAAAAACAAAATCTACAAAAAAAAGCTTCATTCCCATCGAAAA
AACTTTCTTGTGAAATCAACCGAGCTAACAAACAACATCCTGTGCAAAATCTAGC
AGTGAAAGTGTGATATCGTATACCTGTACCTGTAAACCGTTGTGCGCGTGTGTGC
CTTTGTGTATCAATTTTGTGGAAAACAGAAAATACATCAAAATGGTTTACAATTT
CAAAGTCTTCAAGAAGTGCGCCCCTAATGGAAAGGTTACGCTGTACATGGGCAA
GCGTGACTTTGTAGACCACGTTTCCGGCGTTGAACCGATCGATGGTATCGTCGTC
CTCGATGATGAGTACATTCGTGACAACCGTAAGGTATTCGGTCAGATTGTCTGCA
GTTTCCGCTACGGCCGCGAAGAGGACGAGGTGATGGGACTAAACTTCCAGAAGG
AGTTATGCCTCGCTTCCGAACAGATCTACCCGCGTCCGGAAAAGTCGGACAAGG
AGCAGACCAAGCTCCAGGAGCGACTGCTGAAGAAGCTGGGTTCGAACGCCATCC
CGTTCACGTTCAACATCTCGCCGAATGCTCCGTCTTCGGTCACGCTGCAGCAGGG
CGAAGATGATAATGGAGACCCGTGCGGTGTGTCGTACTACGTGAAGATCTTTGCC
GGTGAGTCGGAAACCGATCGTACGCACCGTCGCAGCACCGTTACGCTCGGCATA
CGCAAGATCCAGTTCGCACCGACCAAGCAGGGCCAGCAGCCGTGCACGCTGGTG
CGCAAGGACTTTATGCTAAGCCCGGGAGAGCTGGAGCTCGAGGTCACACTAGAC
AAGCAGCTGTACCTGCACGGGGAGCGAATAGGCGTCAACATCTGCATCCGCAAC
AACTCGAACAAAATGGTCAAGAAGATTAAGGCCATGGTCCAGCAGGGTGTGGAT
GTGGTGCTGTTCCAGAATGGTAGCTACCGCAACACAGTGGCATCGCTGGAGACT
AGCGAGGGTTGCCCAATTCAGCCCGGCTCCAGTCTGCAGAAGGTAATGTACCTCA
CGCCGCTGCTGTCCTCGAACAAGCAGCGACGTGGCATCGCCCTGGACGGTCAGA
TCAAGCGTCAGGATCAGTGTTTGGCCTCGACAACCCTCTTGGCTCAACCGGATCA
GCGAGATGCTTTCGGCGTTATCATATCGTATGCCGTAAAGGTTAAGCTTTTCCTC
GGCGCACTCGGCGGCGAGCTGTCGGCGGAACTTCCATTTGTGCTGATGCACCCAA
AGCCCGGCACCAAGGCTAAGGTCATCCATGCCGACAGCCAGGCCGACGTAGAAA
CTTTCCGACAGGATACAATCGACCAGCAGGCATCAGTTGACTTTGAATAGACGA
CGCAACGGTTTGGAAATGCTACCTACTACCCCAGGCATGGGCTAACACGACGAA
CGAACTACTACTACTAAGCATAAAAAACAGGAAAAAAAATGGAAAACTTAAAA
AATGGATCATACAACCGAACGCAAACGACCTACGACGATCGATCTCACTTCCCC
GTCTTTTTCATCCTAAGCAATAGAACGATGGTAGAAAAGGAAGATAAAGATGGA
GAGAAAGTCACGTGTATCAATGACGACGACTACCAAAACTGAAGACGTAACACA
TGTTCCCCAGCGAGCGGTAACTGTTCTGTTCTGACACCTTCCGCTCGACAATGTA
CCTTTTAAAAACATACAAATTAGAAGTCGTCTTCACTACCTTCAACCAATCCAGC
CACTTTGGTATATACTTTTCATAGAATCCTTCTGAGCGCAAGGACCCTATTGAAA
TTCAGTGTTATTTTGTAACTGCGACCAAATGCCTAGCTGAATGTTGTTGAACGAG
TTATGTACATCAAAAGATTGAATAAAACAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAA
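
The relationship between the cDNA of Figure 1 and the amino acid sequence of Figure 2 can be checked with a short translation routine. The Python sketch below is only illustrative: it assumes the standard genetic code, assumes that translation starts at the first ATG in the cleaned sequence, and uses hypothetical helper names (`clean_sequence`, `translate_first_orf`) that do not appear in the disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def clean_sequence(raw: str) -> str:
    """Keep only A, C, G, T; drops spaces, line breaks, and stray digits."""
    return "".join(ch for ch in raw.upper() if ch in "ACGT")


def translate_first_orf(raw: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon."""
    seq = clean_sequence(raw)
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)
```

As a usage example, the fragment `ACATCAAAATGGTTTACAATTTCAAAGTCTTCAAGAAG` from the start of the coding region, with a stop codon `TGA` appended purely for illustration, translates to `MVYNFKVFKK`.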



Figure 2

Anopheles gambiae arrestin 1 amino acid sequence (SEQ ID NO: 2) 



MVYNFKVFKKCAPNGKVTLYMGKRDFVDHVSGVEPIDGIVVLDDEYIRDNRK
VFGQIVCSFRYGREEDEVMGLNFQKELCLASEQIYPRPEKSDKEQTKLQERLL
KKLGSNAIPFTFNISPNAPSSVTLQQGEDDNGDPCGVSYYVKIFAGESETDRTH
RRSTVTLGIRKIQFAPTKQGQQPCTLVRKDFMLSPGELELEVTLDKQLYLHGE
RIGVNICIRNNSNKMVKKIKAMVQQGVDVVLFQNGSYRNTVASLETSEGCPIQ
PGSSLQKVMYLTPLLSSNKQRRGIALDGQIKRQDQCLASTTLLAQPDQRDAFG
VIISYAVKVKLFLGALGGELSAELPFVLMHPKPGTKAKVIHADSQADVETFRQ
DTIDQQASVDFE

Figure 3a



Anopheles gambiae odorant receptor 1 genomic sequence (SEQ ID NO: 9) 

5 

Features:

1) Presumed untranslated 5' and 3' regions are underlined.

2) Potential TATA box transcription initiation signal is double underlined.

3) Putative Start (ATG) and Stop (TAA) codons are in BOLD.

4) Introns are tentatively assigned and are shown in lower case.



AGCTTTGTTCATTTATGTTGAAATCTAGCCCATTTTGTATAGTGCTGAACGACGAAGAACAT
ACGAAAGTACCTCGTCCGAACACTATCAACATTAATTATACCAAGCTAGAAGAAGATATTTA
TAGTCAAGCCTCAACATCATAGGAAACTTTAGCAAAACCATTTAATTTACATGATGATAAGT
CCCACCTCTTACCCCAGCACAGGTTTGAGAAGGACGAAAGTATCTTTACGATAATATTACTC
TAAGGTAGTTTTTGAATAAAATAAAAATTTACGTGCAAGTGGTGGCATCGGACATCATTCGA
AAGAATCTACTAAGTCATACACACACCCAAGACGACCGACGTAGTTTCATCTAGAAAAAACG
GGTCAGCTCCATCGAACACGTCAGGACATAACTGCGACATGCGTATGGTCAGTTCCACTAGT
GCCAACACTGGTTCCAGGGCACTACCTTCCGAAGCAGTAGAACCTAATGTATTGGAAATTAT
TAGGACATACTGCAACATGCATATGGCTAGTTCCGCTGGTACCAACGATGGCACCAGGACAC
TATCTGCGGCCTTGTAAAATCACTGTAAAATCTATACAAAAACGGCTTTACCCATACTTTAT
CACAAAAACGGCAGGTGAGGGCTGGATTGCTTCAAAGCATTAGAAATATATAATTTCAAAGT
CCATAATCTCCTTAAAAGATAGACAaCAGTAGAGAACACATTTAGTGCTCTTTTCGTTCGAG
TTAGTTGCCTTCTCAAGTAAGCGTTTAATGCTCAATTGTTGTAGATTCGTTGGATGACTCTC
GCTACGTGCTATAGTGGTCAATACTTCCAATTAGATTTCATAATTAGTTTCCAATTGTCCAC
GGAAAACCCaCAAAAGAAAAAAAAACTTGTATCTAGGGTGGAATTTTTCGAGAACAATTGGA
CACTTCATK ^^^^^^^@l^SyTCA4AATGT'r:^ ■ & 1 j 7 AAAC A C C GTTGGA' ? Z : TT^gt 
cgcatttcaattctccaaattctgcaqaataattctqcaaa-tttdcaaaactqctcdacca 

30 ccaataattccaattaatcatctgaacatttaaaactgataattaagatgagtaattgcttc 
gtcatcacctaagaaatcgattagtttggataaaaagaacaaattgaaatacaataaagtcc 
ctgaattttattcgaataacggcttgaactcatttatttcaaaaacctttgagaaattcctc 
gttgaaaattggtctcctatagttctgctaacgggccacttcaaaagcaagaactaacaaaa 
tcataattatggtgcaagtaactatcagtaccagtaatcgccattaaaaacttttcctcaat 

35 ttgcggctcgttaccggctaaatacagagcagagtaacgggaagtgatcaacgtcgctatta 
gtataacgaggaacgccctccgaaggtgtgttgaaggaccttttcaaattgaaaccaagtac 
tgtttccagttttaaattggatagttataaaatgagccgttcaacgatcgggcatcatttga 
gtttcatcttcgaggagaaatagatcagtgccactgtttaaccgaaagtaatgaagctgaac 
aaactgaacccacggtgggatgcgtacgatcgacgggattcgttctggttgcagttgctttg 

40 tttgaaatattta ^^^^^^^^mW^^^^^^^^mW^^^^^^^MW^MM^ 
TCGCGTACGGTTGGGCTT.TGCGGATCATGTTTCTACATCTGTACGCTCTAACGCAAGCCCTA 
TACTTCAAGg ATGTG AA GGATATTAATg t gag tctcia g t tagct'attaqtgttccacctgt 
cca£aa^ 

TG ACQ TTGAT CTACAAGCTGG AAAAGT TT AAC1 A.C AA CAT C G C AC GG AT T C AG G C T T' GT C T G 
45 C G C A AG C TT AA C T G C AC ACT G TAT C A C C C G A A A C A G C G C G A A GAATT C AGgfla.kg^ct^gctZg 
ggaaataEgac 

ATCGATGAGTGGAGTGTTTTGGCTGATGATCTTTCTCATGTTTGTGGCTATCTTCACCATCA 
TCATGTGGGTTATGTCGCCAGCCTTCGACAATGAACGTCGTCTGCCcGTGCCGGCCTGGTTC 
CCGGTGGACTATCACCATTCGGACATAGTGTACGGTGTACTGTTCCTGTATCAAACCATTGG 




WO 02/059274 



PCT/US02/02549 



4/23 

CATCGA-CCATCACTCCAAAGTGTACGGTACGATGTACnCTAAAGTA'ACGGAGTGTGTGCTGT 
: :caa< t :fctcacag<5ttcggcgatgaagtgcaggacattttccaag 



PGTTA CGTAG 



-gggaagaatmcaafaccs^ 

TT-TGTTGGGTTT-TCCAAGTACTTCAAGTTCGATA^GCOTAC^AQCCAAGCAATGATATTTTT 

10 agpACTCTTAAAGATGTTCACATGAAGGTGGGAAGTGTGTTGAAGGTTACGCTAAATCl-'rCA 
CACATTTTTGCAGg t a t g't aat: t a t g c t'g t gg t a 1 1 1 age 1 1 gaaat a age'tacaaacf t: £q 
aaagtaatf fcaatctgttttgtagAT"I\V T GAAGC V^TCGT ^ f^^&TCTr^c '''G ^BSSSSB SPi 
Jg^MGT^TGGcGtTi A J. A JX' C i; TAi ^TGTTGAAATT^A^TTC^?A^T 
kaTaT 1 ^^ aGTTTTCAATTAG 

15 



CCTTTTCCAAAATTTATCAAATTGATTTCGAATTGATTGCAGAGTTTCAGGAATTTAATCTG 
ATAGGATATCTTGTTTATCCAATAGAGGTGTGGAAGCGTTCCCAAGCCATTCGTTTGATAGT 
TTATAGCACCGTCGAGCAGTTGATCGCTGTGATCGCTAGGCGCACCTGATTTTATCTTTATC 
TGGCACCTGTTATGGCAAGGGCGCTTTTCACACGTTTCACACAATATAATGCACATGTATAA 
TGCATTCTTACTTTAGCATTTTTGTTACATATAATACCAAAATTATGCATTTTTATTGTCAC 
20 GCAACGATTAGAGGATGACTTCACAAAGGTCGATCTAGTGGTAGGAGGTATACAATTATACC 
TCTCAAAATCTCACAGCAtAATGAGAAACAAAAGGATACCAAGCATACCCTTTTTTTAGTTG 
ACAATTTCATTTGATTTATGTAATAAAGCACTGCaCGTCGACTTCCTAAAA 



25 



End of Figure 3a 
Figure 3b 

Anopheles gambiae odorant receptor 1 amino acid sequence (SEQ ID NO: 4) 



MKKDSFFKMLNimRWILCLWPPEDTDQATRNRYIAYGWALRIMFLHLYALT 

Q>LYFKDVEI)INDIANALFVIMTQVTLIYKLEKFNYNIARIQACLRKLNCTLY 

HPKQREEFSPVLQSMSGWWEMIFLMFVAIFTIIMWVMSPAFDNERRLPVPA 

WFPVDYHHSDIVYGVLFLYQTIGIVMSATYNFSTDTMFSGLMLHINGQIVRLG 

SMVKKLGHDWPERQLVATDAEWKEMRKRIDHHSKVYGTMYAKVTECVLF 

HKDILRIYLRASMRVCNYHLYDTAATTGGDVTMADLLGCGVYLLVKTSQVFI 

F CYVGNEISYTDKFTEFVGFSNYFKFDKRTS QAMIFFLQMTLKD VHIKVGS VL 

KVTLNLHTFLQIMKLSYSYLAVLQSMESEZ 
Figure 4a 

Anopheles gambiae odorant receptor 2 genomic sequence (SEQ ID NO: 10) 

5 

Features: 

1) Presumed Untranslated 5' and 3' regions are underlined. 

2) Potential TATA box transcription initiation signal is double underlined. 

3) Putative Start (ATG) and Stop (TAA) codons are in BOLD. 

4) Introns are tentatively assigned and are shown in lower case. 

5) Exons are highlighted. 

GGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTCCCTCACCGTGACGTGCTAGAAATG 
GTTCAACATACTCGTCGGGCAGAGCGAAGACGACGAACAGCGGAATGTGCCAGGAAATGTAA 

15 TGAGATATCACAGCAAGTGAACCCAAACCGAGCTGTGGGCTTTGTGTTGCGCTTTAAAAATG 
GCCCTTCCTTCGCCGCATCTGCTTGGTTTCACACGCTTTCCCAGGAAATCCACTGACCACTG 
GC CACACAT CAAC CACCGGAGCGGGAGC CTCAGTG CC CAGCGAAGCATATAATTTGCTCAAA 
AAGTCACGGTACTCAATTAATTTGATTATAATCAATTTCGTGGCTTCCAACACACCCTTCTT 
CCACAATCCATCGCCGAGTGAGCGAGTATAAAGGTGAAGAAACGTACCTTGCGCTTGCTCAC 

20 TAACTGAACCGGATTTCAAAAAGGAACATAAACCGCAACCCACAGCCGAAAA TGCTGATCGA 
AGKGTGTX^^^^m^GTGTGAP^GTGCGAGTGTGGCTGTTC r rGGTCGTATCTGCGGGGGG 
CGCGGTTGTCCCGCTTTCTGGTCGGCTGCATCCCGGTCGCCGTGCTGAACGTTTTCCAGT.TC 
CTGAAGCTGT ACT CGTC CTGGG G CGACATGAGCGAGCTC ATCATCAACGGATACTTTAC CGT, 
GCTGTACTTTAACCTCGTCGtacgtg<ggcgaggggaggggcaataaccttcccacttggt:gg 

25 atattttcataGGttt'tccatgtgtttttttattctctgtttgttgccatccagCTCCGAA'G 
CTCCTTTCTCGTGATCAATCGACGGAAATTTGAGACATTTTTTGAAGGCGTTGCCGCCGAGT 
ACGCTCTCCTCGAGgtaagtcattggtttttctagtt-ttgggggagttgtttacaccataa 
ccacccccgacggtaacatttgatcg 

GCGACCCGTGCTGGAGCGGTACACACGGCGGGGACGCATGCTATCGATATCGAATCTGTGGC 
30 TCGGCGCCrTCATTAGTGCCTGCTTC 

ccg7acggggtcaggataccgggcgtggacgtggtggggaggccgaggtaccaggtcgtgtt 
tgtgctgcaggtttaccttaccttccccgcctgctggatgtacatcccgttcaccagcttct; 
acgcgacctgcacgct3tttgcgctcgtccagatagcggccctaaagcaacggctcggacgc 
ttggggcgcgacagcggcacgatggcttcgaccggacacagcgccggcacactgttcgccga 

35 G C T G AAG G A GT GT C T AAAG TAT C A C AAA C AAAT CATCCAGt aagt agacgc t ag t agac t c g 
accggattgcccttccctcggggaggggagghttgcta*t''C'tcgggatgcggcagcacgcata 
cacacaaaccggaagccattaattctcccgttttcatgcccgcacgggcactgggtcatgtt 
tcacatccttccttcctttccaaacacacacacgcgcgcgtgcacgtacagATATGTTCATG 

atgtcaactcactcgtcacccatctgtgtctgctggagttcctgtcgttcgggatgatgctg 
40 tgcgcactgctgtttctgct^ 

gtctccggactctca€ftcgggactca:atcgttccatctctcaa 

gagatgataatgattggatggtacatcttgatgatagtctcgcagatGtttgcgttgtattg 

GGATGCGAACGAG^^ 

cEa^agatcggccgtcttacattgttgtgtttctgcatggggatcggttttgtttttcctct 
45 ccatttcagAGGCTAGGCATTGGCGATGGCATTTACAATGGAGCGTGGCCGGACTTTGAGGA 
ACCGATAAGGAAACGGTTGATTCTxAA^TTATTGCACGTGCTGAGGGACCGATGGTGGTAAGt't 
feg§€^atcgatgctctgttcaat"gaacatggcacagaaggctgtgtaaatagctgtccatt 
aataagttttttcagaatgtatcgtttttagttgatttaaacgcattgttctatgcaatggt 
agcaacaatagaccgcctttattaatccaagcttcctttaggattgatttttattttaagag 
50 aaagataaaccatttttagtaaccaatttagttacaggaaccaaaatacagaatttattatt 
attattattattattattattattattattattattattattattattattattattattat 
tattattattattattataattattattattattattattattattattattattattatta 
atattattattattattattattattactattattattataattattacttttattattatt 




attattattattattattattattattattattattattattattattattataattatgat 
tattattattattattattattattattattattataacaataataattattattattattt 
attattaattaattaattfcattattattaattattattattgttattcattattatacatta 
ttatcataataataattttattatgattattattattattattattattattattattatta 
5 ttattattattcttattattattattattattattattattaatattatttttaatattatt 
attattattattactattcttattataattatttttttttattattattattattattatta 
ttattattattattattattattattgctattgttattattattcttattattgctattgtt 
attattattattcttattattgttgttgttgttgttcttattattgttgttgttgttattct 
tattattgtttattattattgtttttttttattctctaattattccagtaatccataataaa 

10 aaataataaagtaaataaatagtaaatagtaaataattccagtaactgtagtaatacacaat 
aatctctaagaattaaaattgcattttgtaatgaaatatgttgattgttcgaatagttcaga 
aaaacttaaaaatgcctcagcattaaacagttttgaggttgttcagggcatttagtttagat 
attttagtattttaaagcatttgttttcattactacaaaaaagcaaatttatgagtgaatta 
ctttcagttcttctaaacgcctatgtgtatgcaattacataacaatagctctcttttttatt 

15 gcatttttccttagtaatctaaatccaatctcttctttccctcttgcag^!IfJOmGTGGGG^ 
A C GT G T A C C C G ATG A C G T T G G AAAT G T T T C AAAAAT T G C T C AA CG T G T C CT ACT C C TAT T T C 

A CACTGCTG GG CCGAGTGTAC AACT AAA CT TAACCGGT AAAC AAAC AAAAAT C C G CTC AT CA 
CTATGOUVAGA^ 

GCTTATGCCACGGGATTTGGTGGAAAGTTATTGCACTGAAGCTCTTTCAGCCAAATTTTCAT 

20 GGAGGTTCGCTCTCAACCAACGCATTGAAGCGAATAAAAGTATCAGCAACCAGGCGACGGTG 
AAAAAACGCTGCATTATTGTGCTTGGTTCAGCATTCCAGCGAATGAGTGTTAAACTTTTGGA 
TTC AAAAGT CGCGATGCTCACGATACGGAGCGGTGTGTTGTT CGAT C C GC GGAGTG CAGT CG 
CAAGCCGGTGATGTTGCCGGTGGAAATGCACAGATCGACACAGCGATAGATAATCGTTTGTT 
CGCGTAAATGGGAGGGAAAAAAGTAAGCTGCCAGCTACTTCATTTCCATGTTAATTGAAACT 

25 CAAGCCAACGAACATGCAGAACCCGGTTGGTTGTGTGTGTCCGCTCCGGGAAAGGTCTGTGC 
TCCGGGGCATGGATTCTTTCCCCCTCCGGGTGGTTGGGGGTATTGTTTAGGTTTTTATTTTA 
CAAATTCATATCCTTCCGCTTCCGCATGAGCCGACCCGGTGGGTGGGCGAGACAGATGTGCG 
GCGGGCAACAAAACTATGCACGAACATGGCCAACAAACACAGCTTCTATCTCATCTCTGTGT 
CGCACTGTCTCGCTTTCCGGCTGCGTTGCTTGTAGTACTATCATTGTTTTAGTCCACGGGTT 

30 TACTTCTAATTCCATTGCACCACGCAAAAAGGCTCATCCTTTGCTCGTTCCGGTTGCAACTT 
CGACAAGCGCATGGTTGGGATACGAACAAAAAACCAACTACTCCACCCACTACTACTACTAC 
TGCCACCACCACTAACAACACTACACTTGGTTGGGAGCTTGCAGACCCACAAGCAAACAACG 
ATACAAGCTAGCTAGCTGCTGTGTGCGCTCGAGTCAGCCGACGGTACAAGGTTTAACCGGTA 
CAAGCAACTGCCGGACCGATCCCAAAACTCTGACAAGGCACGGGGCCGCATCCGGCAGTACG 

35 GTCGGAAAACATGGAAATGTTTAATTAAAACTGTAATTGTCAATCGCTGCTACAAGTTGTGA 
CAC AGGGAGAGAGAGAGACAGAG CGCGC CCGATGGTGATGGTGTAAAAGATAGATACAGGAA 
AAGAGCGAGAAACATTGGTACGATTTGGTGTGGTTAGCAAATTTGATTTCCACTGATTTTGA 
GTGCAAATTTAATGCATCGAAAATTTGCCATTCAGGGTAAAGTTGCTCGTGGACGGATCCCC 
CGGGCTGCAGGAATTCGATATCAAGCTTATCGATACCGTCGACCTCGAGGGGGGGCCCGGTA 

40 CCCAGCTTTTGTTCCCTTTAGTGGA 



End of Figure 4a 

45 
Figure 4b 

Anopheles gambiae odorant receptor 2 amino acid sequence (SEQ ID NO: 6) 

5 

MLIEECPIIGVNVRVWLFWSYLRIIPRLSRFLVGCIPVAVLNVFQFLKLYSSWG 
DMSELIINGYFTVLYFNLVLRTSFLVINRRKFETFFEGVAAEYALLEKNDDIRP 
VLERYTRRGRMLSISNLWLGAFISACFVTYPLFVPGRGLPYGVTIPGVDVLAT 
PTYQWFVLQVYLTFPACCMYIPFTSFYATCTLFALVQIAALKQRLGRLGRHS 
GTMASTGHSAGTLFAELKECLKYHKQIIQYVHDLNSLVTHLCLLEFLSFGMM 
LCALLFLLSISNQLAQMIMIGSYIFMILSQMFAFYWHANEVLEASLGIGDArrN 
GAWPDFEEPIRKRLILIIARAQPTDGGKIKVGNVYPMTLEMFQKLLNVSYSYF 
TLLRRVYN 

15 
Figure 5a 

Anopheles gambiae odorant receptor 3 genomic sequence (SEQ ID NO: 11) 



Features: 

1) Presumed Untranslated 5' and 3' regions are underlined. 

2) Putative Start (ATG) and Stop (TAA) codons are in BOLD. 

3) Introns are tentatively assigned and are shown in lower case. 

4) Exons are highlighted. 

AAGCAGAACACATCAAGAAGCAATTAGGTGTGTCGTACGTTAGCAAGTAGTTCGCGAGGAGG 



15 



20 



25 



30 



35 



AATAAAATA' 




jgtaacat'caat'cacagtttga 
iGAGCTGCTGTTTCCCACCCTGGAAATGGC 

:gcagcaccgagagcgcccctgcacgcact 

40 gacgtattttggctactttgacgtttgcacctttgacagctgaaggacagggtacaattttt 
gctgctgttattacgcgcagcgcattggatacgaaaacattggccacaagttctacgatttt 
agcgtttatttactgttcgtagcagcttttttccacaataaacacacacaataacgtaccga 
c agtatt cttttc attgtaggatagagaagg cg c cgg cc ag c agc caaaacgcgc cg gaaaa 

CGAAAGGCGGCACCAGCGGGGGAAAAACACGGGAGCAAAACGAGAACAGAACGCAGTAAACA 
45 ACAAAAC CGGCCGGAACAACAACGGTGCCGGAAACGA 



50 
Figure 5b 

Anopheles gambiae odorant receptor 3 amino acid sequence (SEQ ID NO: 8) 



MPSERLRLITSFGTPQDKRTMVLPKLKDETAVMPFLLQIQTIAGLWGDRSQR 

YRFYLIFSYFCAMVVLPKVLFGYPDLEVAVRGTAELMFESNAFFGMLMFSFQ 

RDNYERLVHQLQDLAALVLQDLPTELGEYLISVNRRVDRFSKIYCCCHFSMA 

TFFWFMPVWTTYSAYFAVRNSTEPVEHVLHLEEELYFLNIRTSMAHYTFYVA 

IMWRTIYTLGFTGGTKLLTIFSNVKYCSAMLKLVALRIHCLARVAQDRAEKEL 

NEIISMHQRVLNCVFLLETTFRWVFFVQFIQCTMIWCSLILYIAVTGFSSTVAN 

VCVQIILVTVETYGYGYFGTDLTTEVLWSYGVAIAIYDSEWYKFSISMRRKLR 

LLLQRSQKPLGVTAGKFRFVNVAQFGKMLKMSYSFYWLKEQF 
Figure 6a 

Anopheles gambiae odorant receptor 4 genomic sequence (SEQ ID NO: 12) 

5 

Features: 

1) Putative Start (ATG) and Stop (TAA) codons are in BOLD. 

2) Introns are tentatively assigned and are shown in lower case. 

1 0 GGGGAACTCCCCCACCCGACCAGACGACGGAAAGCTAACGATGTGCAATTGAAT 
AGTCATTAGTAGCGTTTTTGCTCGCAAACGAACTAACCCTTTGACTTTTTAAGTTC 
ACTACGGTGAGGACAAAAATCAATAAATTAAATCGAGACCGTTGATGAGCAAAA 
GAAAAAAAAATATTTTACTGATTTTCATTTCGTTCCATCGACTACATAATCATAAT 
TATATGCCACATTTTATTATAAGTTTTTGTATCATTTTTAAACAACACAAAAATGC 

1 5 ATCCTTTCGAATATTAGTCAGGTTGTATCAAC AATGAAGTTTGAACTGTTTCA AA 
AATATTCCTCCCCGGACACGGTCTTATCCTTCGTGCTAAGGCTTTTGCATATCGTG 
GGCATGAATGGGGCAGGATTTCGGTCGCGAATTCGAGTTGGTGGCATTTTTCTGT 
TCTATTTAATCTTTCTTGTAATACCGCCACTAACGGGCGGGTACACCGATGGTCA 
CCAGCGTGTACGCACCAGTGTGGAATTCCTGTTTAATTGCAATATTTACGGCGGC 

20 AGTATGTTCTTTGCCTACGATGTGGCCACTTTCCAAGCGTTCATCCAGGAACTGA 
AGAGCCTTTCGGTTTTGGgtaatatttaattaattaaaattgcgtttattgcateatcatttgtttctctttgcagTATGCT 
CACATTCGTACAGACTAAAGTATAAGCTGACCCGGTTCAACCGTCGAGCGGATAT 
TATCGCCAAAGTGCAAACGACCTGCATGGGTGCTGTAACGCTTTTCTACTGGATT 
GCACCGATACCTTCCATCTGTGCGCACTACTACAGGTCGACCAATTCCACCGAAC 

25 CCGTGCGGTTTGTGCAACATTTAGAGGTGAAGTTCTATTGGCTCGAGAATCGCAC 
CTCAGTCGAGGACTACATAACCTTCGTGCTGATCATGCTACCCGTCGTGGTTATG 
TGTGGTTACGTATGCAATTTGAAGGTGATGACCATCTGCTGCAGCATTGGACACT 
GTACACTGTACACCAGGATGACTATAGAGATGGTAGAGCAGTTGGAAAGCATGG 
CATCAGCGGAACGAACTGCCAGCGCCATACGCAACGTGGGGCAGATGCACAGTG 

30 GTTTACTGAAATGCATTAGGCTTTTGAACACGTCAATCCGATCGATGCTGATGCT 
GCAGTGGTTGACCTGCGTGTTAAACTGGAGCATTTCTCTCATCTATCTAACGAAC 
GTGgttagttttgtcttgtttggaaatccaaaaacaaaaagatggctataattgaactttctattacagGGCATCTCGCTACA 
ATCGGTTACCGTGGTGGTAATGTTTTTTCTTGCCACTGCGGAAACTTTCCTGTATT 
GTTTACTTGGGACGCGGCTTGCGACACAACAGCAGCTGCTGGAGCACGCACTCTA 

3 5 TGCTACACGGTGGTAC AACTACCCAATAGCCTTTCGCAGCAGCATTAGGATGATG 
TTGAGACAGTCGCAAAGGCATGCACACATAACGGTGGGGAAGTTTTTTCGCGTTA 
ATTTGGAAGAATTTAGCAGGATTGTCAACTTATCCTACTCTGCTTACGTCGTACTT 
AAGGATGTAATAAAGATGGATGTACAGTGAATGTTTTTTTTTTTGGCTTGGCAAC 
GAATGAAGTTTTCCGAATCTATATTAGATCTAGAATTTAATCTAGATGTCATAAT 

40 ATGATCTTGGCCATGACCGGTTCCTGGTTTTGGAACCAATTCTCAAAACAATTTT 
GAACTTAGGGCGAGGCATGAAATGTCCCAAGAACCTATCCAAGTTCTGGAACTA 
CATATTACCGAATCTATCCCATTATTGCCTCGGAACTGGTTTGGTGCTAAATATTT 
GTCCAAATGTTGGTCCTGGACCTATCCAGACAAAGATCTTCAATTATTCCTACCA 
CTGGAACTGATTAATTGATGTAGGAAGTCATGGAGGTGTTCAGGGAGAATTTAA 

45 ACACTAATGTTCCAACTCATTATTTCAAGGGCAATTCTATTTTTTATATGCCCCTA 
CGGATTGATACGTATGTATTACTCCATTTCCTGGACTTTGTCTTATTCTTGCTGCT 
GATTGGACGTGAAATGTTGAGAAAAAGATTCTTATTTATGAGTGATACAGAGCCT 
TTAAATACTCCTACGTTGTTTGCTATTTAAGTATGGCCAGGCTAATCACAATCGCT 
ACTAATGAACAGAATCTCTTCTAATTAAACCCTTTCGATTGATAGTGTCAATGTC 
AATGTCGAGATAATTGAACTGCAAACgATACCTACCTTAAACGGAGCAGAACAC 
ATCAAGAAGCAATTAGGTGTGTCGTACGTTAGCAAGTAGTTCGCGAGGAGGAAT 
AAAATAG 
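
The Features list for this figure states that introns are shown in lower case, which means the tentative exon/intron assignment can be recovered mechanically from the letter case. The sketch below is only an illustration of that convention: the function names and the toy fragment are hypothetical, and the input is assumed to have been checked for scanning errors first. Upper-case runs in this figure also include flanking untranslated sequence, so the joined result is transcript-like rather than a clean coding sequence.

```python
import re


def split_by_case(genomic: str):
    """Split a case-annotated sequence into upper-case and lower-case runs.

    Upper-case runs are treated as exon (or flanking) sequence and
    lower-case runs as introns, following the figure's stated convention.
    """
    # Drop spaces, margin line numbers, and line breaks before splitting.
    seq = re.sub(r"[^A-Za-z]", "", genomic)
    exons = re.findall(r"[A-Z]+", seq)
    introns = re.findall(r"[a-z]+", seq)
    return exons, introns


def spliced(genomic: str) -> str:
    """Join the upper-case runs, i.e. remove the tentatively assigned introns."""
    exons, _ = split_by_case(genomic)
    return "".join(exons)


# Hypothetical toy fragment for illustration only; it is not taken from the figure.
toy = "ATGGCTTTGGgtaatatttctttgcagTATGCTCACTGA"
exons, introns = split_by_case(toy)
print(exons)         # upper-case runs
print(introns)       # lower-case runs
print(spliced(toy))  # upper-case runs joined together
```

Because the intron boundaries are described as tentative, any downstream translation of the joined sequence should be compared against the corresponding amino acid figure before it is relied on.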



10 
Figure 6b 

Anopheles gambiae odorant receptor 4 amino acid sequence (SEQ ID NO: 14) 



MKFELFQKYSSPDTVLSFVLRLLHIVGMNGAGFRSRIRVGGIFLFYLIFLVIPPLTGGY 

TDGHQRVRTSVEFLFNCNIYGGSMFFAYDVATFQAFIQELKSLSVLVCSHSYRLKYK 

LTRFNRRADIIAKVQTTCMGAVTLFYWIAPIPSICAHYYRSTNSTEPVRFVQHLEVKF 

YWLENRTSVEDYITFVLIMLPVVVMCGYVCNLKVMTICCSIGHCTLYTRMTIEMVEQ 

LESMASAERTASAIRNVGQMHSGLLKCIRLLNTSIRSMLMLQWLTCVLNWSISLIYLT 

NVGISLQSVTVVVMFFLATAETFLYCLLGTRLATQQQLLEHALYATRWYNYPIAFRS 

SIRMMLRQSQRHAHITVGKFFRVNLEEFSRIVNLSYSAYVVLKDVIKMDVQNVSYSY 

FTLLRRVYN 
Figure 7 

ANOPHELES GAMBIAE 

Preferred DNA Codons 

Amino Acids                  Preferred Codons 

Alanine         Ala   A      GCC GCG GCT GCA 
Cysteine        Cys   C      TGC TGT 
Aspartic acid   Asp   D      GAC GAT 
Glutamic acid   Glu   E      GAG GAA 
Phenylalanine   Phe   F      TTC TTT 
Glycine         Gly   G      GGC GGT GGA GGG 
Histidine       His   H      CAC CAT 
Isoleucine      Ile   I      ATC ATT ATA 
Lysine          Lys   K      AAG AAA 
Leucine         Leu   L      CTG CTC TTG CTT CTA TTA 
Methionine      Met   M      ATG 
Asparagine      Asn   N      AAC AAT 
Proline         Pro   P      CCG CCC CCA CCT 
Glutamine       Gln   Q      CAG CAA 
Arginine        Arg   R      CGC CGG CGT CGA AGA AGG 
Serine          Ser   S      TCG AGC TCC AGT TCT TCA 
Threonine       Thr   T      ACG ACC ACT ACA 
Valine          Val   V      GTG GTC GTT GTA 
Tryptophan      Trp   W      TGG 
Tyrosine        Tyr   Y      TAC TAT 
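
A preferred-codon table of this kind is typically used to write a DNA sequence for a given polypeptide. The sketch below back-translates a peptide using the first codon listed for each amino acid in Figure 7. It rests on an assumption that is not stated in the figure, namely that the codons are listed in descending order of preference; the names `FIRST_LISTED` and `back_translate` are illustrative only.

```python
# First codon listed for each amino acid in Figure 7.
# Assumption: the first-listed codon is the most preferred one.
FIRST_LISTED = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGC",
    "S": "TCG", "T": "ACG", "V": "GTG", "W": "TGG", "Y": "TAC",
}


def back_translate(peptide: str) -> str:
    """Return a DNA string encoding `peptide` with the first-listed codons."""
    try:
        return "".join(FIRST_LISTED[aa] for aa in peptide.upper())
    except KeyError as err:
        # Raised for any character that is not one of the 20 listed residues.
        raise ValueError(f"no codon listed for residue {err.args[0]!r}") from None


# First six residues of the arrestin 1 polypeptide (SEQ ID NO: 2).
print(back_translate("MVYNFK"))
```

The output is one of many possible encodings of the peptide and will generally differ from the native cDNA, which does not use the first-listed codon at every position.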
Figure 8 



Name                                SEQ ID NO        FIG. Reference 

Arrestin 1 (cDNA)                   SEQ ID NO: 1     Figure 1 
Arrestin 1 (polypeptide)            SEQ ID NO: 2     Figure 2 
Odorant Receptor 1 (cDNA)           SEQ ID NO: 3 
Odorant Receptor 1 (polypeptide)    SEQ ID NO: 4     Figure 3b 
Odorant Receptor 2 (cDNA)           SEQ ID NO: 5 
Odorant Receptor 2 (polypeptide)    SEQ ID NO: 6     Figure 4b 
Odorant Receptor 3 (cDNA)           SEQ ID NO: 7 
Odorant Receptor 3 (polypeptide)    SEQ ID NO: 8     Figure 5b 
Odorant Receptor 4 (cDNA)           SEQ ID NO: 13 
Odorant Receptor 4 (polypeptide)    SEQ ID NO: 14    Figure 6b 
Odorant Receptor 5 (cDNA)           SEQ ID NO: 15 
Odorant Receptor 5 (polypeptide)    SEQ ID NO: 16    Figure 9b 
Odorant Receptor 6 (cDNA)           SEQ ID NO: 17 
Odorant Receptor 6 (polypeptide)    SEQ ID NO: 18    Figure 10b 
Odorant Receptor 7 (cDNA)           SEQ ID NO: 19 
Odorant Receptor 7 (polypeptide)    SEQ ID NO: 20    Figure 11b 
Figure 9a 



Anopheles gambiae odorant receptor 5 genomic sequence (SEQ ID NO: 21) 

5 



Predicted Exons: ITALICIZED, UNDERLINED AND HIGHLIGHTED. 
Introns: lowercase. 

10 tctagacttgaacccatgacgggcattttattgagtcgttcgagttgacgactgtaccacgggaccacccgtttatcactatcactattaattaattataat 
atgcttttgtagcgatcagcctaccgggttttgtttctctggatatcttaagttcccatttgattatcaagatagaacaacaacttgtaccttaaataatcatta 
cgtacccttaatcaacctgtgcatcaaggagttttcgcgaaagcaaaaatccgattgtctgatgttgtcttgattccatccgattcgttactggttctgcaa 
aatcgtccaataatacggcaatgtccttatcgatgcttgaatcaacatcacattgtttgcatttcgttttttgcgtgcaaatatgttatttgcaaagaaggca 
aggtaatgtgcttaagagtaaatacaattcgctgtccattttttgtccaccag^gtgccagaacccgtgccttttagtccttcgaatacatccgaccagtc 

15 azcaaRcaaetgcat cA'TGGTGV'TACCGAA GCTGTCCGAA CCGfA@GGCGTGATG€G&CTTCTACTA C 
GCCtGCAGCGTJTCGTJ'GGGCTGl (KKlGTGAACGAGGCTATCGCJACAAGTTCCGGT'rGGCAT 
TTfTAAGC^TCTGTC!FGGTAGTAGTTA TTCCGAAGGTTGCCTTCGGCTATCCA G. I TTTA GAGA CA 
ATGGTTCGCGGAACAGCTGAGCTGATTTTCGAATGGAACGTACTGTTTGGGATGTTGCTGTTTT 
CZG$CM@ 

20 gateatgattgataVaag^ 

GACTATCtGGTACGlSATCAA^ 

GTGTTTGGCCATCTTGTACTGGGTGGCTCCTTCGTCCAGCACCTACCTAGCGTACCTGGGGGC 
ACGAAA"CAGATCCGTCGGGGTCGAAGATGTGCTACACCT&GA 

GACCCGCGTCTCGCTGGYAGATTACTCCATATTCACCGCCATCATGCrGCGfACMrCTtTAm 
25 OTA GCGTA CTTCGGTGGA CTAAA GCTGCTAA CCA TCTTCAGCAA CGTGAA GTACTGTTCGGGAA 

mcmAGGCTTGmmcG^ 

iMGGAACmATeGAA^ 

acamctaRctRcmca ^TGTGTGGAGCTGTTGGAAATCATCTTTCGGTGGGTTTTTC7 i I < f iCAGTTC, 

atacagtgcgtaat&atctggtgcagcttggttgsgtacgtggccgtta 

3 0 gatctgtctaccacaccattcactgctgtgtcttgttttgtcactcttcccagtrOy O GAGCA CAAAAGCGGCAAA CGTGGG I 
'&Ta1G!FGTTTATA€TGCTA^ 

zctia seACGrGCTGCGTACGGTAGCCTCrGGTArmeG 
GGA^TGGTACTGCAGCGTGCCCAGA4ACCGGrCGGCATCTCGfGCTGG 
35 GACAttOAGGAGTW&GGA^ TG 
'GGAAAAACATGATAGT-GGWGTA GATCG 

cgacggaaagctaacgatgtgcaattgaatagtcattagtagcgtttttgctcgcaaacgaactaaccctttgactttttaagttcactacggtgaggac 
aaaaatcaataaattaaatcgagaccgttgatgagcaaaagaaaaaaaaatattttactgattttcatttcgttccatcgactacataatcataattatatgc 
cacattttattataagtttttg 



40 
Figure 9b 

Anopheles gambiae odorant receptor 5 amino acid sequence (SEQ ID NO: 16) 



MVLPKLSEPYAVMPLLLRLQRFVGLWGERRYRYKFRLAFLSFCLLWIPKVAFGYPD 
LETMVRGTAELIFEWNVLFGMLLFSLKLDDYDDLVYRYKDISKIAFRKDVPSQMGD 
YLVRINHRIDRFSKIYCCSHLCLAIFYWVAPSSSTYLAYLGARNRSVPVEHVLHLEEE 
LYWFHTRVSLVDYSIFTAIMLPTIFMLAYFGGLKLLTIFSNVKYCSAMLRLVAMRIQF 
MDRLDEREAEKELIEirVMHQKALKCVELLEIIFRWVFLGQFIQCVMIWCSLVLYVAV 
TGLSTKAANVGVLFILLTVETYGFCYFGSDLTSEASCYSLTRAAYGSLWYRRSVSIQR 
KLRMVLQRAQKPVGISAGKFCFVDIEQFGNMAKTSYSFYIVLKDQF 



15 
Figure 10a 

Anopheles gambiae odorant receptor 6 partial genomic sequence (SEQ ID NO: 22) 

5 

These are the predicted last three exons of another candidate Anopheles 
gambiae odorant receptor. 



Predicted Exons: ITALICIZED, UNDERLINED AND HIGHLIGHTED. 
10 Introns: lowercase. 



aacacccatcttatcggcaaaattagtatttaccgtttgaaagcggcttcccttcctg^ 
gcgccgcgtgcta^^^ 

15 GTGGACICTTGATGGCCtGTGGTGATGACGTCTGCTGCGCACCGTT^ 
WlfTTTCAT 

XGGTAAAGTTTGTCCTCffCAT GC TG^GCJjm 
TGAGGATATfGJGG/!^ 

CCTTGGG&IXMT7XM TGCCGCTTAUUffTT&C&AA TGGTA CCGGGAA&OGfCGG'TGGCXFlTGXB 
20 ATCGA W^GTGCT<BGKAA ttata G& GCGCA GGCAGGAGTCCGTCA ta ctga ccgca tggaaaat. 
ffiG'GCCCATCCAMTGAGTACm 

ttccc1fa^&CCTGCAA GCTTCCTGGTCCTACTTTA CCCT0^TGAAGACCGTCTA CGGGA 'A TAAp£q& 
gcgcgagagagagagagagagcagtatcgttcaccctttggatg^ 

ttgcacaatattgtaccattctatacagcttcaccacgaccaagcgtttgttgcatcaggaccaaacacgtttcgacaagccgcgtcacctgctggc 

25 
Figure 10b 

Anopheles gambiae odorant receptor 6 partial amino acid sequence 
(SEQ ID NO: 18) 



LCLPDVAIAHVLFRIRQCTLDGGGDDVCCAPFSARESDLFISCNILFLSRPHRRLDGY 
MLVKFVLFMLCFLIELLMLCAYGEDIVESPWGDZCRLRLRMVPGRVGGVPSIRAAN 
YTPQPAVRHTDRMENLAHPNEYFQSDPASFLVLLYPPEDRLRE 
Figure 11a 

Anopheles gambiae odorant receptor 7 genomic sequence (SEQ ID NO: 23) 

5 

Features 

1. Predicted Exons (7): ALL CAPS, ITALICIZED, UNDERLINED, HIGHLIGHTED 

2. Introns (6): lowercase 

3. 5' and 3' sequences: lowercase 



gcgcttacc^ 
15 ggtttcggtglgagc^ 
aagjgctgaaaaa^ 
ctttc^ccggctacgtcaccggcccg^ 

AfCACGACCCT0TTCTTCACGCACTCG0TCA 
20 HCTTC1ACCGGACGCTCGCCATCTGGAAC&A 
ACGCMCGGTAC^ 

ACCACCGTCCTGTCGGTTGTCG&& 
gc%actgggtttttgStttttctcggtggagggacgggataaaatatctgaa 
gcagagagtttgggtttgatttatcaccgcacaccgaatatcttcacggttca^ 
25 ffimcttcctctcgataaattactcatcgc^ 

TTTTCGGCGAGAGCGTCAAGACTGTG 

mGcccmcm 

TCTCTTTCATCTACCAGGTACG^ 

aaatgggactaaaaccg^cttcacagagccaacacattcctacagcaattgcataccttcgggcggtcgggactgggcaatgcagctacaacatc 

30 ctcgcctaaagttatgcaattcgagcgacaaatgttgccgtgttagggctttttgtgataatagtcgtttttttgtcctctcgcttatcaaactctatcaacgg 
aggaaatccattttcgctacaatgcctacagctcaagtttcaaggtcaatcgagcgggtggggatcaacttttttattcattttgctaacgccccatcaac 
aaattctatgttctcaatggcaaagattactgcccgcaccaatcgcccaacgaaacggcaaaagaaaagcgacgattatgaagatgtccaaaccatt 
gcccgcccgacgctttatctgatgatttgcgggatggcttttacttgtctgctactttcaggcacaaaaggaaatgaaaccagcgcaggctcgtttgcc 
ggcttgcggaggttcttcaggcactgaggctgagtacttaaatcgaacgatttttacgattctggatccagttttatgatgtggcctgcattacagtggc 

35 aattataccctgatgttcatttcattgcaltttgtaagtttgtgctggtaacgcccgtaacgattaattcttttcaaa 

tgtataacaaatgctaacgaatggaccgtacttggagggttgcggaaagtaacgttttaaaatattcatcacaatcctctgcaaacttgtgcttaattaatt 
ggtgcacaataagiltaaactgtggcggcagatgtgtcgctgtccgcttccttccttcccagcaagctcgtgcgaaataatttattccatcattttaatac 
agccgtttgtgcattttaattagcaaagcaatataaaaagcagctaaccatccccattaaaacaaagtgcttccgggcccaattgttatggcggtggaa 
agtaatggttttaccagtggaagtgtcctttcccatcgtgggtacttcgcgatattcttgtcttatacaagtgcatacagaaaaaaaggacaaatcctcct 

40 tgctatggtctaaggccagcttcggtaccgcttccgcttcgggatgtcataaagtttgatgggtgtttttaacattacttccgctcttaaccacctaatgga 
cttttcatgcttgagctaaagttaaaccagccaccagcggtacgcaccgagccacggttgatttcggcggcggcctcatccccagttttgcgccacc 
aatattgccttcattaatctgtaccctcggagcgttagggcccgcggacgagtcctcgttgtaatgcaccgccatgccacgggacgggataatccgtt 
gggacggcgcgaaagcgactatcgcggacggattggttcgaccgtgctacaacacattttatgcttcacagatttacttcctgctgttttcgatggtcc 
agagcaacctcgcggatgtcatgttctgctcctggttgctgctagcctgcgagcagctgcaacacttgaaggtaggtacggtagcaaacgtggttgt 

45 ctttacatccgcgtgcagcattatccttatcgacgtgtagtgttaacggtaaaagaggaagcgataaaaaagcaacattctctcacaccctcgatctc 
ctttattttctctctctctctctctctctctrt^^ 

CAGCCGGTTCCAAATCG^ 

gcataacacaatcccctg^agttcatttcaatgaccttaacactcggcaagctaagcgagacagtggggacagtgagaaagagagaacaa 
50 aaaaaccatcatccgtacgacatcatcgctacgtaccggtatttcaggatgaggaaataaaacgctaggggaatgaaagtgcgacagaatgataaa 
acaatccccacccaggcccccagcctggacgaacggatgtagtgtgcgaagcgagcaaaaaaagtcaaataaattgaagtttaaaaatagattttc 
cccgtccatccgtggtggagcgtaaagcccggcggacaacttcgagcacggcgaccgtgcacagtactgtgccacagttgtagggacggataag 
ctccgttccmtttatcctttttt^ 

aatagaagatcggttctctccatttaatctatcgcgcctgtacgcctgaaactatgcactgtgctgtgaaaccgtcaagctcgagcacgacgaatggc 
ccaccgtaccacgcccgtggtgcccaaagcgcaacgcgaat^ 
5 tctctcttttgcgctttcggtgtatcgaacg 

ggttttcgaaaaaagagcgatttcttctgcgtgtgtgtgtggttttttt 

tattatcagctttagtgtttatcccacccatgccccacatcacgtctgtggagagtgggggaagcttaagtccaatg^ 
tcaccttcttcgtcgatggagattggtgcggttggcacgataaaagcccactgcacgttacggaccgagggaaaggtctttttgtaggcctagcaac 
ggtcctcattcaccgcatgggggtgtagctcagatggtagagcgctcgcttagcatgtgagaggtaccgggatcgatacccggcatctccaaccca 
10 cacaaaacgttttttaagaagatttttagggaagatattaacgcgggtacactgtgctcctctaagttggaagagtagatgagatgatgacaagggag 
aaggaacatgtgtacgtgtttgatagcaaacacacaaacaacaata^ 
gccatcttgtccctctctctcctgttcaacte^ 

ttccacgagcctctgacataagtagccttccgcttatttccttctccttgcacttgtcagte 
acgcctgattgttacattgtcatctacattgcfficcgtttaccgttccgc 

15 GGA CTTTUjTt^^ 

attctttggagcattgmggtgttgtgctgaaaccg 
20 gtaccaattgtgtcagccccgaccgaaagcaggcctaattcgtaccagaaaaaccacaagctgtttgtaagcatcgatacgcccgaagctttcaatc 
cagccaaggcgccacctactattgacgtgactttttgcac^ 
ggagtgaagtttttatttgaacgatatcacccgtatcgatffl^ 

accaatttgaagtccgtcgtcctttgtgtccttgtgtttgtgtgtttgtgtgagctggagacatgggggagtgagtaaccgaacaacctcttgccgc 
tcacgatatcgaacagcaccaagataagcatccctttttccctagccgatgtctccgatatctcgattccgcttccagcgaggcaaagaaaaaggcg 

25 aactggctgacctcacccggggcgaggaaaaagcgtagggattacgtcgagcagcacgagttgtgatttcttcttcttctggttccataaatcgctga 
cggtttccattaccgcctgcggagtgcacacacgtgaagggaaagcgaaaacgtttagattccagcagcaacggcagcaccagaagcagcagca 
gcgcggcaaattgaatcatcctgacgcgatgagttgtctgggttttcgggtcggtggcttacagcaccacaccatctgctgcagctaatacagctgta 
aatttcgttagacatagacttgattttacaatattacacacacacttacacacacagctatagatttgtcgcttggcgtatggctctgtacggcgtgccgta 
catgccgcgagccgtgttgctgctggttgcgatacggatcacgtccgattcgattcagcctgcgtgtttttggtgaagatccttatcggtgacccacttt 

30 cagtgtgtcgagagcgagggtcactatggcgcctgtcagttggaaagctaggctcgattcaaagggccattgtgccagtgttctttttaagatagcga 
taagcttttgatcgaaatagtaaatcaaacattgtttcttttttcctattccaaactgttgccaacctcattattacgtttttgcagcgg^ 
atactttaaggcgtgattttcaaatgtagcgttccgtatgcagaaacgccatggattatgcaatttaaacaatgctgcttccttaacattcaaataacggct 
tattaaggaactttttgtgcaatttgtttttaacagcaaatagttagctcagaacgatcacatttagtatcgcttcaacaaagaactcTO 
tgtaatgccattccctcgagaaagtttcttgtcagtcctcctctgcatcacagcaacaaccaaacctgctcatgtttcctgctcgtttcctagct^ 

35 cgttatttccgattcctgtgcttgcccgcttttcttaca^ 

ggaaacaatgcgccaagctcagcatccagccatgcatgtaaaatgagccacgcgacagattttagacatcgctttcgctctgcaccggaggtggttt 
tattcttgtttccgattcccacgtccattcgtcctgggtccgtccgccgggcccgaaaccgtaagccgtgcggggaattacgcaatcgaaacgagcc 
agaaaatgagcacgccaaatgcaaagaaaatccccttttgagtggtgctcctgccaccactcatctccccaactggtgggtgaaaaaccttgtgcgc 
cccttctctttccagaaaaaaaacgcctcgctcgcacaaaaacatgctcgcccggtgaagctgcgtatgtcgcagaagctcaaaccaacgccgcca 

40 gcaagcatcaacaatttctattcaaacacccaacgcagcgcccaaaccgggtgcactgtactcagtagcgaagatgctcagattgtcccgtgcgct 
gclttcgatgcccgtttcggagcgggaagccatcgcttgccaacg^ 

cggtttgcctgcaaggttgttgcttcccacacgagcattgctttccgtaccgcggtggggcgagttttcaacgcaaccttctacaagcaacgcc 
cgcctgggagcgatatttaacagaaacaagaacatcccgaacttcagcacatgccgtgatttgcctgttggaaaagcttttgtgagcgtgtgagttga 
acgagctctattttcccagcgatgggtggcatttgtgtggcatgctatcgtcagcttttcttgaatctttacctctccattcgcctccattagtacacgcg 
45 tggaaaatgggtgcaacggatcagaacggattttccgcgacagacttaataaagggaaagcaacgcgttttttgcatgtgtagtgtttatgagctttat 
gccgttactttgcaattaaaaatagcaaaaaataacagtttttttttgtaagcggattacaaagaatgtatcagaatattacgtgaaacattcatttcatgct 
gttaacgctcaaatagaatagttttgtaacacggattgcataccttgc^ 
atgactgcgttggtagtacaatatttatttacaccgcgtaato^ 

ggaaccagtgtagcccaatgtgctcttattgaattaccacgaacaaatcaacctgatgcccgggtccgttggcaaacagcttgcgccgaagccgctc 
50 agtgtttcgtgcactaccgtgctgccattttgctgccctcatcgaacagataaacagaagggcaactcttgtgagcatcgcaatgcccgtctgaagttc 
cgtcgaaaatgggcctaaattcaatttgacgcatttacccgcgaacaattgcgcgaaggctgtcaagtgtgttccacgaactgcgacaacaagcaca 
cacacaaacacaaatgttatcgtttcggcatgtttctcggtacaaagcgtgtggcgctatgtggcatgccgattcccagacagagtgatcgatagtaa 
atgtagcctatecggtagcattcaatttccttttctatcctcgcaaacaaagcccattctggggaggcgtggtgaagcttt^ 
atgtcctggttcggagggatgctggggaaagcaaac^^ 
tcacctgcgtatctatgcgtccgtcgtgtcgttcggato^ 

acallcgcacaaaaccgtcctccatttcaaatgcctacacttgtcactgtatatctctctttctctcgtm 

5 GCTGCTCGCCTA GCAXjGCAACGAAAAlTCGA CGGTGTGAAGGTGTAGGGATTGA CCGTAATCGG 

^gtac^gcgctcggcgtgttgccgtgggaaagcattctccctgccccatatcgctfcattctc 
cacmgcttcgccgctgccatc^ 

? 0^C7Wmm<MiUTGG 
10 OA GCAGTGCGA GAA GGCGATGA CTA^i^CGGAGCCAA GTTTTTCA CGGTTTCGCTCGATGTGT 

attttt§c^ 

ggtttaacaaaca^caacaaca_a 
15 agcaaccggggctg^aa^ 

^ggS^atctatgtatgtgtgaj^^ 

P&AtgJcglttc^^ 

cg^ta^atgatcatac^ 

&atgg_ajigtote^ 
20 acttatagttatattt^^^ 

£9^_^cggPA4gpaato 

c^.^ccattatctaaa^^^ 

gcgccagcagcaaaaaaatacatataaaaccttca^ 

aaaaaaaacacttccac^gga 
25 cgtaccgatacc^aacaaac^ 

teggtgcctgggcgaaggctagctc^cta^ 



End of Figure 11a 
Figure 11b 

Anopheles gambiae odorant receptor 7 amino acid sequence (SEQ ID NO: 20) 



MVLIQFFAILGNLATNADDVNELTANTITTLFFTHSVTKFIYFAVNSENFYRTLAIWN 
QTNTHPLFAESDARYHSIALAKMRKLLVLVMATTVLSVVAWVTITFFGESVKTVLD 
KATNETYTVDIPRLPIKSWYPWNAMSGPAYIFSFIYQVRWRNGIMRSLMELSASLDT 
YRPNSSQLFRAISAGSKSELIINEEKDPDVKDFDLSGIYSSKADWGAQFRAPSTLQTFD 
ENGRNGNPNGLTRKQEMMVRSAIKYWVERHKHVVRLVSAIGDTYGPALLLHMLTS 
TIKLTLLAYQATKIDGVNVYGLTVIGYLCYALAQVFLFCIFGNRLIEESSSVMKAAYS 
CHWYDGSEEAKTFVQIVCQQCQKAMTISGAKFFTVSLDLFASVLGAVVTYFMVLVQ 
LK 



15 
SEQUENCE LISTING 

<110> VANDERBILT UNIVERSITY 

<120> MOSQUITO OLFACTORY GENES, POLYPEPTIDES, AND METHODS OF 
USE THEREOF 

<130> N8119 

<140> 
<141> 

<150> 60/264,649 

<151> 2001-01-26 

<160> 23 

<170> PatentIn Ver. 2.1 
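
The angle-bracket numbers in this general-information block follow the numeric identifier scheme of WIPO Standard ST.25 (for example, <110> applicant, <120> title of invention, <160> number of sequences). As a rough sketch of how such a block might be read programmatically, the function below collects each identifier and its value, joins continuation lines, and tolerates stray spaces inside a tag of the kind that scanning can introduce. The function and dictionary names are hypothetical.

```python
import re

# ST.25 numeric identifiers that appear in the general-information block.
FIELD_NAMES = {
    "110": "applicant",
    "120": "title of invention",
    "130": "file reference",
    "140": "current application number",
    "141": "current filing date",
    "150": "earlier application number",
    "151": "earlier application filing date",
    "160": "number of SEQ ID NOs",
    "170": "software",
}


def parse_general_info(text: str) -> dict:
    """Map each numeric identifier (as a string) to its value."""
    fields = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"\s*<([\d ]+)>\s*(.*)", line)
        if match:
            # Remove spaces inside the tag, e.g. "<12 0>" becomes "120".
            current = match.group(1).replace(" ", "")
            fields[current] = match.group(2).strip()
        elif current is not None and line.strip():
            # A non-empty line without a tag continues the previous field.
            fields[current] = (fields[current] + " " + line.strip()).strip()
    return fields


sample = """<110> VANDERBILT UNIVERSITY

<12 0> MOSQUITO OLFACTORY GENES, POLYPEPTIDES, AND METHODS OF
USE THEREOF

<140>
<141>

<160> 23
"""

for tag, value in parse_general_info(sample).items():
    print(tag, FIELD_NAMES.get(tag, "unknown"), "=", repr(value))
```

Empty identifiers such as <140> and <141> are kept with an empty value, which mirrors how they appear in the listing.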

<210> 1 
<211> 1964 
<212> DNA 

<213> Anopheles gambiae 
<400> 1 

acaggaacga cggttgtgat ccctccactg gtggtgacac gaatcataag cattatttca 60 
tacctaaaaa acaaaatcta caaaaaaaag cttcattccc atcgaaaaaa ctttcttgtg 120 
aaatcaaccg agctaacaaa caacatcctg tgcaaaatct agcagtgaaa gtgtgatatc 180 
gtatacctgt acctgtaaac cgttgtgcgc gtgtgtgcct ttgtgtatca attttgtgga 240 
aaacagaaaa tacatcaaaa tggtttacaa tttcaaagtc ttcaagaagt gcgcccctaa 300 
tggaaaggtt acgctgtaca tgggcaagcg tgactttgta gaccacgttt ccggcgttga 360 
accgatcgat ggtatcgtcg tcctcgatga tgagtacatt cgtgacaacc gtaaggtatt 420 
cggtcagatt gtctgcagtt tccgctacgg ccgcgaagag gacgaggtga tgggactaaa 480 
cttccagaag gagttatgcc tcgcttccga acagatctac ccgcgtccgg aaaagtcgga 540 
caaggagcag accaagctcc aggagcgact gctgaagaag ctgggttcga acgccatccc 600 
gttcacgttc aacatctcgc cgaatgctcc gtcttcggtc acgctgcagc agggcgaaga 660 
tgataatgga gacccgtgcg gtgtgtcgta ctacgtgaag atctttgccg gtgagtcgga 720 
aaccgatcgt acgcaccgtc gcagcaccgt tacgctcggc atacgcaaga tccagttcgc 780 
accgaccaag cagggccagc agccgtgcac gctggtgcgc aaggacttta tgctaagccc 840 
gggagagctg gagctcgagg tcacactaga caagcagctg tacctgcacg gggagcgaat 900 
aggcgtcaac atctgcatcc gcaacaactc gaacaaaatg gtcaagaaga ttaaggccat 960 
ggtccagcag ggtgtggatg tggtgctgtt ccagaatggt agctaccgca acacagtggc 1020 
atcgctggag actagcgagg gttgcccaat tcagcccggc tccagtctgc agaaggtaat 1080 
gtacctcacg ccgctgctgt cctcgaacaa gcagcgacgt ggcatcgccc tggacggtca 1140 
gatcaagcgt caggatcagt gtttggcctc gacaaccctc ttggctcaac cggatcagcg 1200 
agatgctttc ggcgttatca tatcgtatgc cgtaaaggtt aagcttttcc tcggcgcact 1260 
cggcggcgag ctgtcggcgg aacttccatt tgtgctgatg cacccaaagc ccggcaccaa 1320 
ggctaaggtc atccatgccg acagccaggc cgacgtagaa actttccgac aggatacaat 1380 
cgaccagcag gcatcagttg actttgaata gacgacgcaa cggtttggaa atgctaccta 1440 
ctaccccagg catgggctaa cacgacgaac gaactactac tactaagcat aaaaaacagg 1500 
aaaaaaaatg gaaaacttaa aaaatggatc atacaaccga acgcaaacga cctacgacga 1560 
tcgatctcac ttccccgtct ttttcatcct aagcaataga acgatggtag aaaaggaaga 1620 
taaagatgga gagaaagtca cgtgtatcaa tgacgacgac taccaaaact gaagacgtaa 1680 
cacatgttcc ccagcgagcg gtaactgttc tgttctgaca ccttccgctc gacaatgtac 1740 
cttttaaaaa catacaaatt agaagtcgtc ttcactacct tcaaccaatc cagccacttt 1800 
ggtatatact tttcatagaa tccttctgag cgcaaggacc ctattgaaat tcagtgttat 1860 
tttgtaactg cgaccaaatg cctagctgaa tgttgttgaa cgagttatgt acatcaaaag 1920 
attgaataaa acaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa 1964 
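
The relationship between the cDNA of SEQ ID NO: 1 and the polypeptide of SEQ ID NO: 2 can be checked by translating the open reading frame with the standard genetic code. The sketch below is a minimal illustration under stated assumptions: `clean_listing` and `translate` are hypothetical helper names, the cleaning step simply discards every character that is not a nucleotide letter (so position counts and spaces vanish, but so would any misread base), and the default search for the first ATG is naive and may not coincide with the annotated start codon of a full-length sequence.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean_listing(text: str) -> str:
    """Strip position counts, spaces, and line breaks from listing lines."""
    return re.sub(r"[^acgtnACGTN]", "", text).upper()


def translate(dna: str, start=None) -> str:
    """Translate from `start` (default: first ATG) to the first stop codon."""
    dna = dna.upper()
    if start is None:
        start = dna.find("ATG")
        if start < 0:
            return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unknown codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# One line of SEQ ID NO: 1 (positions 241-300), which contains the start codon.
line = "aaacagaaaa tacatcaaaa tggtttacaa tttcaaagtc ttcaagaagt gcgcccctaa 300"
print(translate(clean_listing(line)))
```

For a whole-sequence check, passing an explicit `start` index is safer than relying on the first-ATG default, and the result can then be compared residue by residue with the polypeptide listing.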
<210> 2 

<211> 383 

<212> PRT 

<213> Anopheles gambiae 

<400> 2 

Met Val Tyr Asn Phe Lys Val Phe Lys Lys Cys Ala Pro Asn Gly Lys 
1               5                   10                  15 

Val Thr Leu Tyr Met Gly Lys Arg Asp Phe Val Asp His Val Ser Gly 
            20                  25                  30 

Val Glu Pro Ile Asp Gly Ile Val Val Leu Asp Asp Glu Tyr Ile Arg 
        35                  40                  45 

Asp Asn Arg Lys Val Phe Gly Gln Ile Val Cys Ser Phe Arg Tyr Gly 
    50                  55                  60 

Arg Glu Glu Asp Glu Val Met Gly Leu Asn Phe Gln Lys Glu Leu Cys 
65                  70                  75                  80 

Leu Ala Ser Glu Gln Ile Tyr Pro Arg Pro Glu Lys Ser Asp Lys Glu 
                85                  90                  95 

Gln Thr Lys Leu Gln Glu Arg Leu Leu Lys Lys Leu Gly Ser Asn Ala 
            100                 105                 110 

Ile Pro Phe Thr Phe Asn Ile Ser Pro Asn Ala Pro Ser Ser Val Thr 
        115                 120                 125 

Leu Gln Gln Gly Glu Asp Asp Asn Gly Asp Pro Cys Gly Val Ser Tyr 
    130                 135                 140 

Tyr Val Lys Ile Phe Ala Gly Glu Ser Glu Thr Asp Arg Thr His Arg 
145                 150                 155                 160 

Arg Ser Thr Val Thr Leu Gly Ile Arg Lys Ile Gln Phe Ala Pro Thr 
                165                 170                 175 

Lys Gln Gly Gln Gln Pro Cys Thr Leu Val Arg Lys Asp Phe Met Leu 
            180                 185                 190 

Ser Pro Gly Glu Leu Glu Leu Glu Val Thr Leu Asp Lys Gln Leu Tyr 
        195                 200                 205 

Leu His Gly Glu Arg Ile Gly Val Asn Ile Cys Ile Arg Asn Asn Ser 
    210                 215                 220 

Asn Lys Met Val Lys Lys Ile Lys Ala Met Val Gln Gln Gly Val Asp 
225                 230                 235                 240 

Val Val Leu Phe Gln Asn Gly Ser Tyr Arg Asn Thr Val Ala Ser Leu 
                245                 250                 255 

Glu Thr Ser Glu Gly Cys Pro Ile Gln Pro Gly Ser Ser Leu Gln Lys 
            260                 265                 270 
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Val Met Tyr Leu 
275 

lie Ala Leu Asp 
290 

Thr Thr Leu Leu 
305 

lie Ser Tyr Ala 



Glu Leu Ser Ala 
340 

Thr Lys Ala Lys 
355 

Phe Arg Gin Asp 
370 



Thr Pro Leu Leu 
280 

Gly Gin lie Lys 
295 

Ala Gin Pro Asp 
310 

Val Lys Val Lys 
325 

Glu Leu Pro Phe 



Val lie His Ala 

360 

Thr lie Asp Gin 
375 



Ser Ser Asn Lys 



Arg Gin Asp Gin 
300 

Gin Arg Asp Ala 
315 

Leu Phe Leu Gly 
330 

Val Leu Met His 
345 

Asp Ser Gin Ala 



Gin Ala Ser Val 
380 



Gin Arg Arg Gly 
285 

Cys Leu Ala Ser 



Phe Gly Val lie 
320 

Ala Leu Gly Gly 
335 

Pro Lys Pro Gly 
350 

Asp Val Glu Thr 

365 

Asp Phe Glu 



<210> 3 
<211> 1239 
<212> DNA 

<213> Anopheles gambiae 



<400> 3 

atgaagctga acaaactgaa cccacggtgg gatgcgtacg atcgacggga ttcgttctgg 60
ttgcagttgc tttgtttgaa atatttaggc ctatggccac cggaagatac ggatcaggca 120
acgcggaacc ggtacatcgc gtacggttgg gctttgcgga tcatgtttct acatctgtac 180
gctctaacgc aagccctata cttcaaggat gtgaaggata ttaatgacat cgcaaatgca 240
ttgttcgtgc ttatgactca agtgacgttg atctacaagc tggaaaagtt taactacaac 300
atcgcacgga ttcaggcttg tctgcgcaag cttaactgca cactgtatca cccgaaacag 360
cgcgaagaat tcagccccgt tttacaatcg atgagtggag tgttttggct gatgatcttt 420
ctcatgtttg tggctatctt caccatcatc atgtgggtta tgtcgccagc cttcgacaat 480
gaacgtcgtc tgcccgtgcc ggcctggttc ccggtggact atcaccattc ggacatagtg 540
tacggtgtac tgttcctgta tcaaaccatt ggaatcgtca tgagcgcaac gtacaacttc 600
tcgaccgata ccatgttttc cggcttgatg ctacacataa atggacaaat tgtgcggctt 660
ggtagtatgg ttaaaaagct tggacatgac gtccctcccg aacgccaatt ggtcgcaacg 720
gatgcggaat ggaaagagat gcgaaagcgc atcgaccatc actccaaagt gtacggtacg 780
atgtacgcta aagtaacgga gtgtgtgctg tttcacaagg acatcttaag gatctatctt 840
cgcgcaagta tgcgcgtctg taattatcat ttgtatgaca ctgctgcaac taccgggggc 900
gatgttacga tggccgatct gctgggctgt ggggtctatt tgctagtaaa gacatcgcaa 960
gtgtttattt tctgttacgt agggaatgaa atctcctata cgacggataa atttacagag 1020
tttgttgggt tttccaacta cttcaagttc gataagcgta ccagccaagc aatgatattt 1080
tttctgcaaa tgactcttaa agatgttcac atcaaggtgg gaagtgtctt gaaggttacg 1140
ctaaatcttc acacattttt gcagattatg aagctatcgt actcctatct ggccgtactt 1200
cagagcatgg aatcagagta atggtgttaa tatccttaa 1239



<210> 4 
<211> 394 
<212> PRT 

<213> Anopheles gambiae 
<400> 4

Met Lys Lys Asp Ser Phe Phe Lys Met Leu Asn Lys His Arg Trp Ile
1 5 10 15

Leu Cys Leu Trp Pro Pro Glu Asp Thr Asp Gln Ala Thr Arg Asn Arg
20 25 30

Tyr Ile Ala Tyr Gly Trp Ala Leu Arg Ile Met Phe Leu His Leu Tyr
35 40 45

Ala Leu Thr Gln Ala Leu Tyr Phe Lys Asp Val Lys Asp Ile Asn Asp
50 55 60

Ile Ala Asn Ala Leu Phe Val Leu Met Thr Gln Val Thr Leu Ile Tyr
65 70 75 80

Lys Leu Glu Lys Phe Asn Tyr Asn Ile Ala Arg Ile Gln Ala Cys Leu
85 90 95

Arg Lys Leu Asn Cys Thr Leu Tyr His Pro Lys Gln Arg Glu Glu Phe
100 105 110

Ser Pro Val Leu Gln Ser Met Ser Gly Val Phe Trp Leu Met Ile Phe
115 120 125

Leu Met Phe Val Ala Ile Phe Thr Ile Ile Met Trp Val Met Ser Pro
130 135 140

Ala Phe Asp Asn Glu Arg Arg Leu Pro Val Pro Ala Trp Phe Pro Val
145 150 155 160

Asp Tyr His His Ser Asp Ile Val Tyr Gly Val Leu Phe Leu Tyr Gln
165 170 175

Thr Ile Gly Ile Val Met Ser Ala Thr Tyr Asn Phe Ser Thr Asp Thr
180 185 190

Met Phe Ser Gly Leu Met Leu His Ile Asn Gly Gln Ile Val Arg Leu
195 200 205

Gly Ser Met Val Lys Lys Leu Gly His Asp Val Pro Pro Glu Arg Gln
210 215 220

Leu Val Ala Thr Asp Ala Glu Trp Lys Glu Met Arg Lys Arg Ile Asp
225 230 235 240

His His Ser Lys Val Tyr Gly Thr Met Tyr Ala Lys Val Thr Glu Cys
245 250 255

Val Leu Phe His Lys Asp Ile Leu Arg Ile Tyr Leu Arg Ala Ser Met
260 265 270

Arg Val Cys Asn Tyr His Leu Tyr Asp Thr Ala Ala Thr Thr Gly Gly
275 280 285

Asp Val Thr Met Ala Asp Leu Leu Gly Cys Gly Val Tyr Leu Leu Val
290 295 300

Lys Thr Ser Gln Val Phe Ile Phe Cys Tyr Val Gly Asn Glu Ile Ser
305 310 315 320

Tyr Thr Asp Lys Phe Thr Glu Phe Val Gly Phe Ser Asn Tyr Phe Lys
325 330 335

Phe Asp Lys Arg Thr Ser Gln Ala Met Ile Phe Phe Leu Gln Met Thr
340 345 350

Leu Lys Asp Val His Ile Lys Val Gly Ser Val Leu Lys Val Thr Leu
355 360 365

Asn Leu His Thr Phe Leu Gln Ile Met Lys Leu Ser Tyr Ser Tyr Leu
370 375 380

Ala Val Leu Gln Ser Met Glu Ser Glu Glx
385 390

<210> 5 
<211> 1142 
<212> DNA 

<213> Anopheles gambiae 



<400> 5 

atgctgatcg aagagtgtcc gataattggt gtcaatgtgc gagtgtggct gttctggtcg 60
tatctgcggc ggccgcggtt gtcccgcttt ctggtcggct gcatcccggt cgccgtgctg 120
aacgttttcc agttcctgaa gctgtactcg tcctggggcg acatgagcga gctcatcatc 180
aacggatact ttaccgtgct gtactttaac ctcgtcctcc gaacctcctt tctcgtgatc 240
aatcgacgga aatttgagac attttttgaa ggcgttgccg ccgagtacgc tctcctcgag 300
aaaaatgacg acatccgacc cgtgctggag cggtacacac ggcggggacg catgctatcg 360
atatcgaatc tgtggctcgg cgccttcatt agtgcctgct ttgtgaccta tcctctgttt 420
gtgcccgggc gcggcctacc gtacggcgtc acgataccgg gcgtggacgt gctggccacc 480
ccgacctacc aggtcgtgtt tgtgctgcag gtttacctta ccttccccgc ctgctgcatg 540
tacatcccgt tcaccagctt ctacgcgacc tgcacgctgt ttgcgctcgt ccagatagcg 600
gccctaaagc aacggctcgg acgcttgggg cgccacagcg gcacgatggc ttcgaccgga 660
cacagcgccg gcacactgtt cgccgagctg aaggagtgtc taaagtatca caaacaaatc 720
atccaatatg ttcatgatct caactcactc gtcacccatc tgtgtctgct ggagttcctg 780
tcgttcggga tgatgctgtg cgcactgctg tttctgctaa gcattagcaa tcagctggca 840
cagatgataa tgattggatc gtacatcttc atgatactct cgcagatgtt tgccttctat 900
tggcatgcga acgaggtact ggagcagagc ctaggcattg gcgatgccat ttacaatgga 960
gcgtggccgg actttgagga accgataagg aaacggttga ttctaattat tgcacgtgct 1020
cagcgaccga tggtggtaag attaaagtcg gcaacgtgta cccgatgacg ttggaaatgt 1080
ttcaaaaatt gctcaacgtg tcctactcct atttcacact gctgcgccga gtgtacaact 1140
aa 1142



<210> 6 
<211> 380 
<212> PRT 

<213> Anopheles gambiae
<400> 6 

Met Leu Ile Glu Glu Cys Pro Ile Ile Gly Val Asn Val Arg Val Trp
1 5 10 15

Leu Phe Trp Ser Tyr Leu Arg Arg Pro Arg Leu Ser Arg Phe Leu Val
20 25 30

Gly Cys Ile Pro Val Ala Val Leu Asn Val Phe Gln Phe Leu Lys Leu
35 40 45

Tyr Ser Ser Trp Gly Asp Met Ser Glu Leu Ile Ile Asn Gly Tyr Phe
50 55 60

Thr Val Leu Tyr Phe Asn Leu Val Leu Arg Thr Ser Phe Leu Val Ile
65 70 75 80

Asn Arg Arg Lys Phe Glu Thr Phe Phe Glu Gly Val Ala Ala Glu Tyr
85 90 95

Ala Leu Leu Glu Lys Asn Asp Asp Ile Arg Pro Val Leu Glu Arg Tyr
100 105 110

Thr Arg Arg Gly Arg Met Leu Ser Ile Ser Asn Leu Trp Leu Gly Ala
115 120 125

Phe Ile Ser Ala Cys Phe Val Thr Tyr Pro Leu Phe Val Pro Gly Arg
130 135 140

Gly Leu Pro Tyr Gly Val Thr Ile Pro Gly Val Asp Val Leu Ala Thr
145 150 155 160

Pro Thr Tyr Gln Val Val Phe Val Leu Gln Val Tyr Leu Thr Phe Pro
165 170 175

Ala Cys Cys Met Tyr Ile Pro Phe Thr Ser Phe Tyr Ala Thr Cys Thr
180 185 190

Leu Phe Ala Leu Val Gln Ile Ala Ala Leu Lys Gln Arg Leu Gly Arg
195 200 205

Leu Gly Arg His Ser Gly Thr Met Ala Ser Thr Gly His Ser Ala Gly
210 215 220

Thr Leu Phe Ala Glu Leu Lys Glu Cys Leu Lys Tyr His Lys Gln Ile
225 230 235 240

Ile Gln Tyr Val His Asp Leu Asn Ser Leu Val Thr His Leu Cys Leu
245 250 255

Leu Glu Phe Leu Ser Phe Gly Met Met Leu Cys Ala Leu Leu Phe Leu
260 265 270

Leu Ser Ile Ser Asn Gln Leu Ala Gln Met Ile Met Ile Gly Ser Tyr
275 280 285

Ile Phe Met Ile Leu Ser Gln Met Phe Ala Phe Tyr Trp His Ala Asn
290 295 300

Glu Val Leu Glu Ala Ser Leu Gly Ile Gly Asp Ala Ile Tyr Asn Gly
305 310 315 320

Ala Trp Pro Asp Phe Glu Glu Pro Ile Arg Lys Arg Leu Ile Leu Ile
325 330 335

Ile Ala Arg Ala Gln Pro Thr Asp Gly Gly Lys Ile Lys Val Gly Asn
340 345 350

Val Tyr Pro Met Thr Leu Glu Met Phe Gln Lys Leu Leu Asn Val Ser
355 360 365

Tyr Ser Tyr Phe Thr Leu Leu Arg Arg Val Tyr Asn
370 375 380



<210> 7 
<211> 1236 
<212> DNA 

<213> Anopheles gambiae 



<400> 7 

atgccttctg agcggcttcg tctcattact tccttcggaa ctcctcaaga caaacgcacg 60
atggtactgc caaaattaaa ggatgaaaca gcagtgatgc cgtttctgct gcaaattcaa 120
accattgccg gactgtgggg tgaccgttcc cagcggtacc gtttttatct catcttttcc 180
tacttctgcg cgatggtggt tctacccaaa gtgctgttcg gttatccaga tctcgaggtt 240
gcggtacgcg gcacggccga gctgatgttc gaatcgaacg cattcttcgg catgctaatg 300
ttttcctttc aacgcgacaa ctacgagcga ttggtgcatc agctgcagga tctggcagct 360
ctagtcctcc aagacctacc cacagagctg ggagagtacc tgatctcagt gaaccgacgg 420
gtcgatcggt tctccaaaat ttactgctgc tgtcactttt ccatggcaac gttcttttgg 480
ttcatgcccg tctggacgac ctattccgcc tactttgctg tgcgcaacag cacggaaccg 540
gtcgagcacg tgttgcacct cgaggaagag ctgtacttcc tgaacattcg gacttcgatg 600
gcgcactata cgttttatgt ggccattatg tggcccacga tctatacgct cgggtttacc 660
ggtggcacaa agctgctgac cattttcagc aatgttaagt actgttcggc catgctgaag 720
ctcgttgcac tccgaatcca ctgtctagcg agagtagcgc aagaccgagc ggaaaaggag 780
ctgaacgaga ttatttccat gcatcagcgg gtactcaact gcgtgttcct gctggagacg 840
acattccgct gggtattttt cgtgcagttc attcagtgta caatgatctg gtgcagtctc 900
atcctctaca tagcggtgac ggggttcagc tcgacggtag cgaatgtatg tgtccagatc 960
attttggtga cggtggaaac ttacggctac ggctacttcg gaacagatct aaccacggag 1020
gtgctttgga gctatggcgt tgccctcgcc atttacgata gcgagtggta caagttttcc 1080
atttcgatgc gccgcaaact tcgactgcta ctgcaacgat cccaaaaacc gctcggcgta 1140
acggcgggaa agtttcgctt cgtcaatgtg gcccagtttg gcaagatgct caagatgtcc 1200
tattcatttt acgtagtact gaaggagcag ttttag 1236



<210> 8 
<211> 411 
<212> PRT 

<213> Anopheles gambiae 
<400> 8 

Met Pro Ser Glu Arg Leu Arg Leu Ile Thr Ser Phe Gly Thr Pro Gln
1 5 10 15

Asp Lys Arg Thr Met Val Leu Pro Lys Leu Lys Asp Glu Thr Ala Val
20 25 30

Met Pro Phe Leu Leu Gln Ile Gln Thr Ile Ala Gly Leu Trp Gly Asp
35 40 45

Arg Ser Gln Arg Tyr Arg Phe Tyr Leu Ile Phe Ser Tyr Phe Cys Ala
50 55 60

Met Val Val Leu Pro Lys Val Leu Phe Gly Tyr Pro Asp Leu Glu Val
65 70 75 80

Ala Val Arg Gly Thr Ala Glu Leu Met Phe Glu Ser Asn Ala Phe Phe
85 90 95

Gly Met Leu Met Phe Ser Phe Gln Arg Asp Asn Tyr Glu Arg Leu Val
100 105 110

His Gln Leu Gln Asp Leu Ala Ala Leu Val Leu Gln Asp Leu Pro Thr
115 120 125

Glu Leu Gly Glu Tyr Leu Ile Ser Val Asn Arg Arg Val Asp Arg Phe
130 135 140

Ser Lys Ile Tyr Cys Cys Cys His Phe Ser Met Ala Thr Phe Phe Trp
145 150 155 160

Phe Met Pro Val Trp Thr Thr Tyr Ser Ala Tyr Phe Ala Val Arg Asn
165 170 175

Ser Thr Glu Pro Val Glu His Val Leu His Leu Glu Glu Glu Leu Tyr
180 185 190

Phe Leu Asn Ile Arg Thr Ser Met Ala His Tyr Thr Phe Tyr Val Ala
195 200 205

Ile Met Trp Pro Thr Ile Tyr Thr Leu Gly Phe Thr Gly Gly Thr Lys
210 215 220

Leu Leu Thr Ile Phe Ser Asn Val Lys Tyr Cys Ser Ala Met Leu Lys
225 230 235 240

Leu Val Ala Leu Arg Ile His Cys Leu Ala Arg Val Ala Gln Asp Arg
245 250 255

Ala Glu Lys Glu Leu Asn Glu Ile Ile Ser Met His Gln Arg Val Leu
260 265 270

Asn Cys Val Phe Leu Leu Glu Thr Thr Phe Arg Trp Val Phe Phe Val
275 280 285

Gln Phe Ile Gln Cys Thr Met Ile Trp Cys Ser Leu Ile Leu Tyr Ile
290 295 300

Ala Val Thr Gly Phe Ser Ser Thr Val Ala Asn Val Cys Val Gln Ile
305 310 315 320

Ile Leu Val Thr Val Glu Thr Tyr Gly Tyr Gly Tyr Phe Gly Thr Asp
325 330 335

Leu Thr Thr Glu Val Leu Trp Ser Tyr Gly Val Ala Leu Ala Ile Tyr
340 345 350

Asp Ser Glu Trp Tyr Lys Phe Ser Ile Ser Met Arg Arg Lys Leu Arg
355 360 365

Leu Leu Leu Gln Arg Ser Gln Lys Pro Leu Gly Val Thr Ala Gly Lys
370 375 380

Phe Arg Phe Val Asn Val Ala Gln Phe Gly Lys Met Leu Lys Met Ser
385 390 395 400

Tyr Ser Phe Tyr Val Val Leu Lys Glu Gln Phe
405 410



<210> 9 
<211> 3895 
<212> DNA 

<213> Anopheles gambiae 
<400> 9 

agctttgttc atttatgttg aaatctagcc cattttgtat agtgctgaac gacgaagaac 60
atacgaaagt acctcgtccg aacactatca acattaatta taccaagcta gaagaagata 120
tttatagtca agcctcaaca tcataggaaa ctttagcaaa accatttaat ttacatgatg 180
ataagtccca cctcttaccc cagcacaggt ttgagaagga cgaaagtatc tttacgataa 240
tattactcta aggtagtttt tgaataaaat aaaaatttac gtgcaagtgg tggcatcgga 300
catcattcga aagaatctac taagtcatac acacacccaa gacgaccgac gtagtttcat 360
ctagaaaaaa cgggtcagct ccatcgaaca cgtcaggaca taactgcgac atgcgtatgg 420
tcagttccac tagtgccaac actggttcca gggcactacc ttccgaagca gtagaaccta 480
atgtattgga aattattagg acatactgca acatgcatat ggctagttcc gctggtacca 540
acgatggcac caggacacta tctgcggcct tgtaaaatca ctgtaaaatc tatacaaaaa 600
cggctttacc catactttat cacaaaaacg gcaggtgagg gctggattgc ttcaaagcat 660
tagaaatata taatttcaaa gtccataatc tccttaaaag atagacaaca gtagagaaca 720
catttagtgc tcttttcgtt cgagttagtt gccttctcaa gtaagcgttt aatgctcaat 780
tgttgtagat tcgttggatg actctcgcta cgtgctatag tggtcaatac ttccaattag 840
atttcataat tagtttccaa ttgtccacgg aaaacccaca aaagaaaaaa aaacttgtat 900
ctagggtgga atttttcgag aacaattgga cacttcatat gaaaaaggac agctttttca 960
aaatgttaaa taaacaccgt tggatccttt gttggatttc aattctccaa attctgcaga 1020
ataattctgc aaattttaca aaactgctca accaccaata attccaatta atcatctgaa 1080
catttaaaac tgataattaa gatgagtaat tgcttcgtca tcacctaaga aatcgattag 1140
tttggataaa aagaacaaat tgaaatacaa taaagtccct gaattttatt cgaataacgg 1200
cttgaactca tttatttcaa aaacctttga gaaattcctc gttgaaaatt ggtctcctat 1260
agttctgcta acgggccact tcaaaagcaa gaactaacaa aatcataatt atggtgcaag 1320
taactatcag taccagtaat cgccattaaa aacttttcct caatttgcgg ctcgttaccg 1380
gctaaataca gagcagagta acgggaagtg atcaacgtcg ctattagtat aacgaggaac 1440
gccctccgaa ggtgtgttga aggacctttt caaattgaaa ccaagtactg tttccagttt 1500
taaattggat agttataaaa tgagccgttc aacgatcggg catcatttga gtttcatctt 1560
cgaggagaaa tagatcagtg ccactgttta accgaaagta atgaagctga acaaactgaa 1620
cccacggtgg gatgcgtacg atcgacggga ttcgttctgg ttgcagttgc tttgtttgaa 1680
atatttaggc ctatggccac cggaagatac ggatcaggca acgcggaacc ggtacatcgc 1740
gtacggttgg gctttgcgga tcatgtttct acatctgtac gctctaacgc aagccctata 1800
cttcaaggat gtgaaggata ttaatgtgag tctctagtta gctattagtg ttccacctgt 1860
ccataatctg tcttttattg ggtaggacat cgcaaatgca ttgttcgtgc ttatgactca 1920
agtgacgttg atctacaagc tggaaaagtt taactacaac atcgcacgga ttcaggcttg 1980
tctgcgcaag cttaactgca cactgtatca cccgaaacag cgcgaagaat tcaggtaagc 2040
ctgctgggaa atatgactaa aaagagtgct aacaaacgac tctcctccaa atgtagcccc 2100
gttttacaat cgatgagtgg agtgttttgg ctgatgatct ttctcatgtt tgtggctatc 2160
ttcaccatca tcatgtgggt tatgtcgcca gccttcgaca atgaacgtcg tctgcccgtg 2220
ccggcctggt tcccggtgga ctatcaccat tcggacatag tgtacggtgt actgttcctg 2280
tatcaaacca ttggaatcgt catgagcgca acgtacaact tctcgaccga taccatgttt 2340
tccggcttga tgctacacat aaatggacaa attgtgcggc ttggtagtat ggttaaaaag 2400
gtgagttacg gcgactactt gcctccagta aggacaggga gtttgtttcc gttatgatat 2460
cattttatca gcttggacat gacgtccctc ccgaacgcca attggtcgca acggatgcgg 2520
aatggaaaga gatgcgaaag cgcatcgacc atcactccaa agtgtacggt acgatgtacg 2580
ctaaagtaac ggagtgtgtg ctgtttcaca aggacatctt aaggtacgaa ttgggccaat 2640
taattgtgtc atttaaaaag cttgacccaa cttttcacag cttcggcgat gaagtgcagg 2700
acattttcca aggatctatc ttcgcgcaag tatgcgcgtc tgtaattatc atttgtatga 2760
cactgctgca actaccgggg gcgatgttac gatggccgat ctgctgggct gtggggtcta 2820
tttgctagta aagacatcgc aagtgtttat tttctgttac gtagggaatg aaatctccta 2880
tacggtaggt tggacacgta gaggaattaa atgtttggga agaatatcaa taccaaatag 2940
tatgatgttt cgttacagac ggataaattt acagagtttg ttgggttttc caactacttc 3000
aagttcgata agcgtaccag ccaagcaatg atattttttc tgcaaatgtg agatagcggt 3060
gtatttgtgc agtcagtaca ttaaatacgt tctctatttc aggactctta aagatgttca 3120
catcaaggtg ggaagtgtct tgaaggttac gctaaatctt cacacatttt tgcaggtatg 3180
taattatgct gtggtattta gcttgaaata agctacaaac tttgaaagta atttcaatct 3240
gttttgtaga ttatgaagct atcgtactcc tatctggccg tacttcagag catggaatca 3300
gagtaatggt gttaatatcc ttaatgttga aattatattt tgttagattt attgcataaa 3360
gtaatattta attttataca tcaaacgtaa gcccgctagt tttcaattag ccttttccaa 3420
aatttatcaa attgatttcg aattgattgc agagtttcag gaatttaatc tgataggata 3480
tcttgtttat ccaatagagg tgtggaagcg ttcccaagcc attcgtttga tagtttatag 3540
caccgtcgag cagttgatcg ctgtgatcgc taggcgcacc tgattttatc tttatctcgc 3600
acctgttatg gcaagggcgc ttttcacacg tttcacacaa tataatgcac atgtataatg 3660
cattcttact ttagcatttt tgttacatat aataccaaaa ttatgcattt ttattctcac 3720
gcaacgatta gaggatgact tcacaaaggt ccatctagtg gtaggaggta tacaattata 3780
cctctcaaaa tctcacagca taatgagaaa caaaaggata ccaagcatac ccttttttta 3840
cttgacaatt tcatttgatt tatgtaataa agcactgcac gtcgacttcc taaaa 3895



<210> 10 
<211> 4985 
<212> DNA

<213> Anopheles gambiae 
<400> 10 

gggatcctct agagtcgacc tgcaggcatg caagcttccc tcaccgtgac gtgctagaaa 60
tggttcaaca tactcgtccg gcagagcgaa gacgacgaac agcggaatgt cccaggaaat 120
gtaatgagat atcacagcaa gtgaacccaa accgagctgt gcgctttgtg ttgcgcttta 180
aaaatggccc ttccttcgcc gcatctgctt ggtttcacac gctttcccag gaaatccact 240
gaccactggc cacacatcaa ccaccggagc gggagcctca gtgcccagcg aagcatataa 300
tttgctcaaa aagtcacggt actcaattaa tttgattata atcaatttcg tggcttccaa 360
cacacccttc ttccacaatc catcgccgag tgagcgagta taaaggtgaa gaaacgtacc 420
ttgcgcttgc tcactaactg aaccggattt caaaaaggaa cataaaccgc aacccacagc 480
cgaaaatgct gatcgaagag tgtccgataa ttggtgtcaa tgtgcgagtg tggctgttct 540
ggtcgtatct gcggcggccg cggttgtccc gctttctggt cggctgcatc ccggtcgccg 600
tgctgaacgt tttccagttc ctgaagctgt actcgtcctg gggcgacatg agcgagctca 660
tcatcaacgg atactttacc gtgctgtact ttaacctcgt cgtacgtggg cgaggggagg 720
ggcaataacc ttcccacttg gtggatattt tcataccttt tccatgtgtt tttttattct 780
ctgtttgttg ccatccagct ccgaacctcc tttctcgtga tcaatcgacg gaaatttgag 840
acattttttg aaggcgttgc cgccgagtac gctctcctcg aggtaagtca ttggtttttc 900
tagtttttgg gggagttgtt tacaccataa ccacccccga cggtaacatt tgatcgtccc 960
gcgaaaatgt ttgtacagaa aaatgacgac atccgacccg tgctggagcg gtacacacgg 1020
cggggacgca tgctatcgat atcgaatctg tggctcggcg ccttcattag tgcctgcttt 1080
gtgacctatc ctctgtttgt gcccgggcgc ggcctaccgt acggcgtcac gataccgggc 1140
gtggacgtgc tggccacccc gacctaccag gtcgtgtttg tgctgcaggt ttaccttacc 1200
ttccccgcct gctgcatgta catcccgttc accagcttct acgcgacctg cacgctgttt 1260
gcgctcgtcc agatagcggc cctaaagcaa cggctcggac gcttggggcg ccacagcggc 1320
acgatggctt cgaccggaca cagcgccggc acactgttcg ccgagctgaa ggagtgtcta 1380
aagtatcaca aacaaatcat ccagtaagta gacgctagta gactcgaccg gattgccctt 1440
ccctcgggga ggggaggttt gctatttcgg gatgcggcag cacgcataca cacaaaccgg 1500
aagccattaa ttctcccgtt ttcatgcccg cacgggcact gggtcatgtt tcacatcctt 1560
ccttcctttc caaacacaca cacgcgcgcg tgcacgtaca gatatgttca tgatctcaac 1620
tcactcgtca cccatctgtg tctgctggag ttcctgtcgt tcgggatgat gctgtgcgca 1680
ctgctgtttc tgctaagcat tgtaagtaaa atcgaccgac gtgcggtcgc tagtccgtct 1740
ccggactctc atttcgggac tcaatcgttc catctctcaa tagagcaatc agctggcaca 1800
gatgataatg attggatcgt acatcttcat gatactctcg cagatgtttg ccttctattg 1860
gcatgcgaac gaggtactgg agcaggtaat ggcgctgaag ctgagtttgg ttgagcggtt 1920
cgctatagat cggctgtctt acattgttgt gtttctgcat ggggatcggt tttgtttttc 1980
ctctccattt cagagcctag gcattggcga tgccatttac aatggagcgt ggccggactt 2040
tgaggaaccg ataaggaaac ggttgattct aattattgca cgtgctcagc gaccgatggt 2100
ggtaagtttg gctgatcgat gctctgttca atgaacatgg cacagaaggc tgtgtaaata 2160
gctgttcatt aataagtttt ttcagaatgt atcgttttta gttgatttaa acgcattgtt 2220
ctatgcaatg gtagcaacaa tagaccgcct ttattaatcc aagcttcctt taggattgat 2280
ttttatttta agagaaagat aaaccatttt tagtaaccaa tttagttaca ggaaccaaaa 2340
tacagaattt attattatta ttattattat tattattatt attattatta ttattattat 2400
tattattatt attattatta ttattattat tattataatt attattatta ttattattat 2460
tattattatt attattatta atattattat tattattatt attattacta ttattattat 2520
aattattact tttattatta ttattattat tattattatt attattatta ttattattat 2580
tattattatt attataatta tgattattat tattattatt attattatta ttattattat 2640
aacaataata attattatta ttatttatta ttaattaatt aatttattat tattaattat 2700
tattattgtt attcattatt atacattatt atcataataa taattttatt atgattatta 2760
ttattattat tattattatt attattatta ttattattat tcttattatt attattatta 2820
ttattattat taatattatt tttaatatta ttattattat tattactatt cttattataa 2880
ttattttttt ttattattat tattattatt attattatta ttattattat tattattatt 2940
gctattgtta ttattattct tattattgct attgttatta ttattattct tattattgtt 3000
gttgttgttg ttcttattat tgttgttgtt gttattctta ttattgttta ttattattgt 3060
ttttttttat tctctaatta ttccagtaat ccataataaa aaataataaa gtaaataaat 3120
agtaaatagt aaataattcc agtaactgta gtaatacaca ataatctcta agaattaaaa 3180
ttgcattttg taatgaaata tgttgattgt tcgaatagtt cagaaaaact taaaaatgcc 3240
tcagcattaa acagttttga ggttgttcag ggcatttagt ttagatattt tagtatttta 3300
aagcatttgt tttcattact acaaaaaagc aaatttatga gtgaattact ttcagttctt 3360
ctaaacgcct atgtgtatgc aattacataa caatagctct cttttttatt gcatttttcc 3420
ttagtaatct aaatccaatc tcttctttcc ctcttgcaga ttaaagtcgg caacgtgtac 3480
ccgatgacgt tggaaatgtt tcaaaaattg ctcaacgtgt cctactccta tttcacactg 3540
ctgcgccgag tgtacaacta aacttaaccg gtaaacaaac aaaaatcccc tcatcactat 3600
gcaaagacag caagcagccg atcatcaaac accattagca gccacaaagt taccagccgc 3660
ttatcccacg ggatttggtg gaaagttatt gcactgaagc tctttcaccc aaattttcat 3720
ggaggttccc tctcaaccaa cccattgaag cgaataaaag tatcagcaac caggcgacgg 3780
tgaaaaaacg ctgcattatt gtgcttgctt cagcattcca gcgaatgact cttaaacttt 3840
tccattcaaa agtcgcgatg ctcacgatac ggagcggtgt gttgttcgat ccgccgagtg 3900
cactcgcaag ccggtgatgt tgccggtgga aatgcacaga tcgacacagc gatagataat 3960
cgtttgttcg cgtaaatggg agggaaaaaa gtaagctgcc agctacttca tttccatgtt 4020
aattgaaact caagccaacg aacatgcaga acccggttgg ttgtgtgtct ccgctccggg 4080
aaaggtctct gctccggggc atggattctt tccccctccg ggtggttggg ggtattgttt 4140
aggtttttat tttacaaatt catatccttc cgcttccgca tcagccgacc cggtgggtgc 4200
gccagacaga tgtgcggcgg gcaacaaaac tatgcacgaa catggccaac aaacacagct 4260
tctatctcat ctctgtgtcg cactgtctcg ctttcccgct gcgttgcttg tagtactatc 4320
attgttttag tccacgggtt tacttctaat tccattgcac cacgcaaaaa ggctcatcct 4380
ttgctcgttc cggttgcaac ttcgacaagc gcatggttgg gatacgaaca aaaaaccaac 4440
tactccaccc actactacta ctactgccac caccactaac aacactacac ttggttggga 4500
gcttgcagac ccacaagcaa acaacgatac aagctagcta gctgctgtgt gcgctcgagt 4560
cagccgacgg tacaaggttt aaccggtaca agcaactccc ggaccgatcc caaaactctg 4620
acaaggcacg gggccgcatc cggcagtacg gtcggaaaac atggaaatgt ttaattaaaa 4680
ctgtaattgt caatcgctgc tacaagttgt gacacaggga gagagagaga cagagcgcgc 4740
ccgatggtga tggtgtaaaa gatagataca ggaaaagagc gagaaacatt ggtacgattt 4800
ggtgtggtta gcaaatttga tttccactga ttttgagtgc aaatttaatg catcgaaaat 4860
ttgccattca gggtaaagtt gctcgtggac ggatcccccg ggctgcagga attcgatatc 4920
aagcttatcg ataccgtcga cctcgagggg gggcccggta cccagctttt gttcccttta 4980
gtgga 4985
<210> 11
<211> 2083 
<212> DNA 

<213> Anopheles gambiae 
<400> 11 

aagcagaaca catcaagaag caattaggtg tgtcgtacgt tagcaagtag ttcgcgagga 60
ggaataaaat agatgccttc tgagcggctt cgtctcatta cttccttcgg aactcctcaa 120
gacaaacgca cgatggtact gccaaaatta aaggatgaaa cagcagtgat gccgtttctg 180
ctgcaaattc aaaccattgc cggactgtgg ggtgaccgtt cccagcggta ccgtttttat 240
ctcatctttt cctacttctg cgcgatggtg gttctaccca aagtgctgtt cggttatcca 300
gatctcgagg ttgcggtacg cggcacggcc gagctgatgt tcgaatcgaa cgcattcttc 360
ggcatgctaa tgttttcctt tcaacgcgac aactacgagc gattggtgca tcagctgcag 420
gatctggcag ctctaggtga gtatgcagcc aatcgattgt tccaaacctt cgcaacatcc 480
ttcgtaacac tgctacactt tcagtcctcc aagacctacc cacagagctg ggagagtacc 540
tgatctcagt gaaccgacgg gtcgatcggt tctccaaaat ttactgctgc tgtcactttt 600
ccatggcaac gttcttttgg ttcatgcccg tctggacgac ctattccgcc tactttgctg 660
tgcgcaacag cacggaaccg gtcgagcacg tgttgcacct cgaggaagag ctgtacttcc 720
tgaacattcg gacttcgatg gcgcactata cgttttatgt ggccattatg tggcccacga 780
tctatacgct cgggtttacc ggtggcacaa agctgctgac cattttcagc aatgttaagt 840
actgttcggc catgctgaag ctcgttgcac tccgaatcca ctgtctagcg agagtagcgc 900
aagaccgagc ggaaaaggag ctgaacgaga ttatttccat gcatcagcgg gtactcaagt 960
aagtaaattc aaattgaaag ttttgcaggg aataacttga gtgtgtctga cccgtgcaca 1020
tcctagctgc gtgttcctgc tggagacgac attccgctgg gtatttttcg tgcagttcat 1080
tcagtgtaca atgatctggt gcagtctcat cctctacata gcggtgacgg taatagcatt 1140
ttcgtcattt cgttagcctt attcaatcca tttttgtgaa cgtgaatttc ccccaggggt 1200
tcagctcgac ggtagcgaat gtatgtgtcc agatcatttt ggtgacggtg gaaacttacg 1260
gctacggcta cttcggaaca gatctaacca cggaggtgct ttgggtaccc tttggatgaa 1320
gcttcaaaaa gtaattccaa attctgtttt cgatttttcc ccttttccac tagagctatg 1380
gcgttgccct cgccatttac gatagcgagt ggtacaagtt ttccatttcg atgcgccgca 1440
aacttcgact gctactgcaa cgatcccaaa aaccgctcgg cgtaacggcg ggaaagtttc 1500
gcttcgtcaa tgtggcccag tttggcaagg taacattaat tacagtttga aaattctgaa 1560
gaatgcatct tacttgcctt acttgttgtt ccagatgctc aagatgtcct attcatttta 1620
cgtagtactg aaggagcagt tttaggagct gctgtttccc accctggaaa tggccttttc 1680
gcactgtctt ctgtttgttg gacgcacgca gcaccgagag cgcccctgca cgcactgacg 1740
tattttggct actttgacgt ttgcaccttt gacagctgaa ggacagggta caatttttgc 1800
tgctgttatt acgcgcagcg cattggatac gaaaacattg gccacaagtt ctacgatttt 1860
agcgtttatt tactgttcgt agcagctttt ttccacaata aacacacaca ataacgtacc 1920
gacagtattc ttttcattgt aggatagaga agccgccggc cagcagccaa aacgcgccgc 1980
aaaacgaaag gcggcaccac cgggggaaaa acacgggagc aaaacgagaa cagaacgcag 2040
taaacaacaa aaccggccgg aacaacaacg gtgccggaaa cga 2083



<210> 12 
<211> 2374 
<212> DNA 

<213> Anopheles gambiae 
<400> 12 

ggggaactcc cccacccgac cagacgacgg aaagctaacg atgtgcaatt gaatagtcat 60
tagtagcgtt tttgctcgca aacgaactaa ccctttgact ttttaagttc actacggtga 120
ggacaaaaat caataaatta aatcgagacc gttgatgagc aaaagaaaaa aaaatatttt 180
actgattttc atttcgttcc atcgactaca taatcataat tatatgccac attttattat 240
aagtttttgt atcattttta aacaacacaa aaatgcatcc tttcgaatat tagtcaggtt 300
gtatcaacaa tgaagtttga actgtttcaa aaatattcct ccccggacac ggtcttatcc 360
ttcgtgctaa ggcttttgca tatcgtgggc atgaatgggg caggatttcg gtcgcgaatt 420
cgagttggtg gcatttttct gttctattta atctttcttg taataccgcc actaacgggc 480
gggtacaccg atggtcacca gcgtgtacgc accagtgtgg aattcctgtt taattgcaat 540
atttacggcg gcagtatgtt ctttgcctac gatgtggcca ctttccaagc gttcatccag 600
gaactgaaga gcctttcggt tttgggtaat atttaattaa ttaaaattgc gtttattgca 660
tcatcatttg tttctctttg cagtatgctc acattcgtac agactaaagt ataagctgac 720
ccggttcaac cgtcgagcgg atattatcgc caaagtgcaa acgacctgca tgggtgctgt 780
aacgcttttc tactggattg caccgatacc ttccatctgt gcgcactact acaggtcgac 840
caattccacc gaacccgtgc ggtttgtgca acatttagag gtgaagttct attggctcga 900
gaatcgcacc tcagtcgagg actacataac cttcgtgctg atcatgctac ccgtcgtggt 960
tatgtgtggt tacgtatgca atttgaaggt gatgaccatc tgctgcagca ttggacactg 1020
tacactgtac accaggatga ctatagagat ggtagagcag ttggaaagca tggcatcagc 1080
ggaacgaact gccagcgcca tacgcaacgt ggggcagatg cacagtggtt tactgaaatg 1140
cattaggctt ttgaacacgt caatccgatc gatgctgatg ctgcagtggt tgacctgcgt 1200
gttaaactgg agcatttctc tcatctatct aacgaacgtg gttagttttg tcttgtttgg 1260
aaatccaaaa acaaaaagat ggctataatt gaactttcta ttacagggca tctcgctaca 1320
atcggttacc gtggtggtaa tgttttttct tgccactgcg gaaactttcc tgtattgttt 1380
acttgggacg cggcttgcga cacaacagca gctgctggag cacgcactct atgctacacg 1440
gtggtacaac tacccaatag cctttcgcag cagcattagg atgatgttga gacagtcgca 1500
aaggcatgca cacataacgg tggggaagtt ttttcgcgtt aatttggaag aatttagcag 1560
gattgtcaac ttatcctact ctgcttacgt cgtacttaag gatgtaataa agatggatgt 1620
acagtgaatg tttttttttt tggcttggca acgaatgaag ttttccgaat ctatattaga 1680
tctagaattt aatctagatg tcataatatg atcttggcca tgaccggttc ctggttttgg 1740
aaccaattct caaaacaatt ttgaacttag ggcgaggcat gaaatgtccc aagaacctat 1800
ccaagttctg gaactacata ttaccgaatc tatcccatta ttgcctcgga actggtttgg 1860
tgctaaatat ttgtccaaat gttggtcctg gacctatcca gacaaagatc ttcaattatt 1920
cctaccactg gaactgatta attgatgtag gaagtcatgg aggtgttcag ggagaattta 1980
aacactaatg ttccaactca ttatttcaag ggcaattcta ttttttatat gcccctacgg 2040
attgatacgt atgtattact ccatttcctg gactttgtct tattcttgct gctgattgga 2100
cgtgaaatgt tgagaaaaag attcttattt atgagtgata cagagccttt aaatactcct 2160
acgttgtttg ctatttaagt atggccaggc taatcacaat cgctactaat gaacagaatc 2220
tcttctaatt aaaccctttc gattgatagt gtcaatgtca atgtcgagat aattgaactg 2280
caaacgatac ctaccttaaa cggagcagaa cacatcaaga agcaattagg tgtgtcgtac 2340
gttagcaagt agttcgcgag gaggaataaa atag 2374



<210> 13 
<211> 1194 
<212> DNA 

<213> Anopheles gambiae 
<400> 13 

atgaagtttg aactgtttca aaaatattcc tccccggaca cggtcttatc cttcgtgcta 60
aggcttttgc atatcgtggg catgaatggg gcaggatttc ggtcgcgaat tcgagttggt 120
ggcatttttc tgttctattt aatctttctt gtaataccgc cactaacggg cgggtacacc 180
gatggtcacc agcgtgtacg caccagtgtg gaattcctgt ttaattgcaa tatttacggc 240
ggcagtatgt tctttgccta cgatgtggcc actttccaag cgttcatcca ggaactgaag 300
agcctttcgg ttttggtatg ctcacattcg tacagactaa agtataagct gacccggttc 360
aaccgtcgag cggatattat cgccaaagtg caaacgacct gcatgggtgc tgtaacgctt 420
ttctactgga ttgcaccgat accttccatc tgtgcgcact actacaggtc gaccaattcc 480
accgaacccg tgcggtttgt gcaacattta gaggtgaagt tctattggct cgagaatcgc 540
acctcagtcg aggactacat aaccttcgtg ctgatcatgc tacccgtcgt ggttatgtgt 600
ggttacgtat gcaatttgaa ggtgatgacc atctgctgca gcattggaca ctgtacactg 660
tacaccagga tgactataga gatggtagag cagttggaaa gcatggcatc agcggaacga 720
actgccagcg ccatacgcaa cgtggggcag atgcacagtg gtttactgaa atgcattagg 780
cttttgaaca cgtcaatccg atcgatgctg atgctgcagt ggttgacctg cgtgttaaac 840
tggagcattt ctctcatcta tctaacgaac gtgggcatct cgctacaatc ggttaccgtg 900
gtggtaatgt tttttcttgc cactgcggaa actttcctgt attgtttact tgggacgcgg 960
cttgcgacac aacagcagct gctggagcac gcactctatg ctacacggtg gtacaactac 1020
ccaatagcct ttcgcagcag cattaggatg atgttgagac agtcgcaaag gcatgcacac 1080
ataacggtgg ggaagttttt tcgcgttaat ttggaagaat ttagcaggat tgtcaactta 1140
tcctactctg cttacgtcgt acttaaggat gtaataaaga tggatgtaca gtga 1194



<210> 14 
<211> 412 
<212> PRT 

<213> Anopheles gambiae 



<400> 14 
Met Lys Phe Glu Leu Phe Gln Lys Tyr Ser Ser Pro Asp Thr Val Leu
1 5 10 15

Ser Phe Val Leu Arg Leu Leu His Ile Val Gly Met Asn Gly Ala Gly
20 25 30

Phe Arg Ser Arg Ile Arg Val Gly Gly Ile Phe Leu Phe Tyr Leu Ile
35 40 45

Phe Leu Val Ile Pro Pro Leu Thr Gly Gly Tyr Thr Asp Gly His Gln
50 55 60

Arg Val Arg Thr Ser Val Glu Phe Leu Phe Asn Cys Asn Ile Tyr Gly
65 70 75 80

Gly Ser Met Phe Phe Ala Tyr Asp Val Ala Thr Phe Gln Ala Phe Ile
85 90 95

Gln Glu Leu Lys Ser Leu Ser Val Leu Val Cys Ser His Ser Tyr Arg
100 105 110

Leu Lys Tyr Lys Leu Thr Arg Phe Asn Arg Arg Ala Asp Ile Ile Ala
115 120 125

Lys Val Gln Thr Thr Cys Met Gly Ala Val Thr Leu Phe Tyr Trp Ile
130 135 140

Ala Pro Ile Pro Ser Ile Cys Ala His Tyr Tyr Arg Ser Thr Asn Ser
145 150 155 160

Thr Glu Pro Val Arg Phe Val Gln His Leu Glu Val Lys Phe Tyr Trp
165 170 175

Leu Glu Asn Arg Thr Ser Val Glu Asp Tyr Ile Thr Phe Val Leu Ile
180 185 190

Met Leu Pro Val Val Val Met Cys Gly Tyr Val Cys Asn Leu Lys Val
195 200 205

Met Thr Ile Cys Cys Ser Ile Gly His Cys Thr Leu Tyr Thr Arg Met
210 215 220

Thr Ile Glu Met Val Glu Gln Leu Glu Ser Met Ala Ser Ala Glu Arg
225 230 235 240

Thr Ala Ser Ala Ile Arg Asn Val Gly Gln Met His Ser Gly Leu Leu
245 250 255



WO 02/059274 



PCT/US02/02549



Lys Cys Ile Arg Leu Leu Asn Thr Ser Ile Arg Ser Met Leu Met Leu
260 265 270

Gln Trp Leu Thr Cys Val Leu Asn Trp Ser Ile Ser Leu Ile Tyr Leu
275 280 285

Thr Asn Val Gly Ile Ser Leu Gln Ser Val Thr Val Val Val Met Phe
290 295 300

Phe Leu Ala Thr Ala Glu Thr Phe Leu Tyr Cys Leu Leu Gly Thr Arg
305 310 315 320

Leu Ala Thr Gln Gln Gln Leu Leu Glu His Ala Leu Tyr Ala Thr Arg
325 330 335

Trp Tyr Asn Tyr Pro Ile Ala Phe Arg Ser Ser Ile Arg Met Met Leu
340 345 350

Arg Gln Ser Gln Arg His Ala His Ile Thr Val Gly Lys Phe Phe Arg
355 360 365

Val Asn Leu Glu Glu Phe Ser Arg Ile Val Asn Leu Ser Tyr Ser Ala
370 375 380

Tyr Val Val Leu Lys Asp Val Ile Lys Met Asp Val Gln Asn Val Ser
385 390 395 400

Tyr Ser Tyr Phe Thr Leu Leu Arg Arg Val Tyr Asn
405 410



<210> 15 
<211> 1176 
<212> DNA 

<213> Anopheles gambiae 
<400> 15 

atggtgctac cgaagctgtc cgaaccgtac gccgtgatgc cgcttctact acgcctgcag 60 
cgtttcgttg ggctgtgggg tgaacgacgc tatcgctaca agttccggtt ggcattttta 120
agcttctgtc tgctagtagt tattccgaag gttgccttcg gctatccaga tttagagaca 180 
atggttcgcg gaacagctga gctgattttc gaatggaacg tactgtttgg gatgttgctg 240
ttttctctca agctagacga ctatgatgat ctggtgtacc ggtacaagga catatcaaag 300 
attgctttcc gtaaggacgt tccctcgcag atgggcgact atctggtacg catcaatcat 360 
cgtatcgatc ggttttccaa gatctactgc tgcagccatc tgtgtttggc catcttctac 420
tgggtggctc cttcgtccag cacctaccta gcgtacctgg gggcacgaaa cagatccgtc 480
ccggtcgaac atgtgctaca cctggaggag gagctgtact ggtttcacac ccgcgtctcg 540
ctggtagatt actccatatt caccgccatc atgctgccta caatctttat gctagcgtac 600 
ttcggtggac taaagctgct aaccatcttc agcaacgtga agtactgttc ggcaatgctc 660 
aggcttgtgg cgatgagaat ccagttcatg gaccggctgg acgagcgcga agcggaaaag 720
gaactgatcg aaatcatcgt catgcatcag aaggcgctaa aatgtgtgga gctgttggaa 780
atcatctttc ggtgggtttt tctgggacag ttcatacagt gcgtaatgat ctggtgcagc 840
ttggttctgt acgtcgccgt tacgggtctc agcacaaaag cggcaaacgt gggtgtactg 900 
tttatactgc taacagtgga aacctacgga ttctgctact ttggcagtga tcttacctcg 960 
gaggcaagtt gttattcgct gacacgtgct gcgtacggta gcctctggta tcgccgttcg 1020
gtttcgattc aacggaagct tcgaatggta ctgcagcgtg cccagaaacc ggtcggcatc 1080 
tcggctggga agttttgctt cgtcgacatt gagcagtttg gcaatatggc aaaaacatca 1140 
tactcgttct acatcgttct gaaggatcaa ttttaa 1176 
<210> 16
<211> 391 
<212> PRT 

<213> Anopheles gambiae 
<400> 16 

Met Val Leu Pro Lys Leu Ser Glu Pro Tyr Ala Val Met Pro Leu Leu
1 5 10 15

Leu Arg Leu Gln Arg Phe Val Gly Leu Trp Gly Glu Arg Arg Tyr Arg
20 25 30

Tyr Lys Phe Arg Leu Ala Phe Leu Ser Phe Cys Leu Leu Val Val Ile
35 40 45

Pro Lys Val Ala Phe Gly Tyr Pro Asp Leu Glu Thr Met Val Arg Gly
50 55 60

Thr Ala Glu Leu Ile Phe Glu Trp Asn Val Leu Phe Gly Met Leu Leu
65 70 75 80

Phe Ser Leu Lys Leu Asp Asp Tyr Asp Asp Leu Val Tyr Arg Tyr Lys
85 90 95

Asp Ile Ser Lys Ile Ala Phe Arg Lys Asp Val Pro Ser Gln Met Gly
100 105 110

Asp Tyr Leu Val Arg Ile Asn His Arg Ile Asp Arg Phe Ser Lys Ile
115 120 125

Tyr Cys Cys Ser His Leu Cys Leu Ala Ile Phe Tyr Trp Val Ala Pro
130 135 140

Ser Ser Ser Thr Tyr Leu Ala Tyr Leu Gly Ala Arg Asn Arg Ser Val
145 150 155 160

Pro Val Glu His Val Leu His Leu Glu Glu Glu Leu Tyr Trp Phe His
165 170 175

Thr Arg Val Ser Leu Val Asp Tyr Ser Ile Phe Thr Ala Ile Met Leu
180 185 190

Pro Thr Ile Phe Met Leu Ala Tyr Phe Gly Gly Leu Lys Leu Leu Thr
195 200 205

Ile Phe Ser Asn Val Lys Tyr Cys Ser Ala Met Leu Arg Leu Val Ala
210 215 220

Met Arg Ile Gln Phe Met Asp Arg Leu Asp Glu Arg Glu Ala Glu Lys
225 230 235 240

Glu Leu Ile Glu Ile Ile Val Met His Gln Lys Ala Leu Lys Cys Val
245 250 255

Glu Leu Leu Glu Ile Ile Phe Arg Trp Val Phe Leu Gly Gln Phe Ile
260 265 270

Gln Cys Val Met Ile Trp Cys Ser Leu Val Leu Tyr Val Ala Val Thr
275 280 285

Gly Leu Ser Thr Lys Ala Ala Asn Val Gly Val Leu Phe Ile Leu Leu
290 295 300

Thr Val Glu Thr Tyr Gly Phe Cys Tyr Phe Gly Ser Asp Leu Thr Ser
305 310 315 320

Glu Ala Ser Cys Tyr Ser Leu Thr Arg Ala Ala Tyr Gly Ser Leu Trp
325 330 335

Tyr Arg Arg Ser Val Ser Ile Gln Arg Lys Leu Arg Met Val Leu Gln
340 345 350

Arg Ala Gln Lys Pro Val Gly Ile Ser Ala Gly Lys Phe Cys Phe Val
355 360 365

Asp Ile Glu Gln Phe Gly Asn Met Ala Lys Thr Ser Tyr Ser Phe Tyr
370 375 380

Ile Val Leu Lys Asp Gln Phe
385 390



<210> 17 
<211> 474 
<212> DNA 

<213> Anopheles gambiae 
<400> 17 

ttatgcttac cggatgttgc gatcgcgcac gtgcttttcc gcatacgcca gtgcacactt 60 

gatggcggtg gtgatgacgt ctgctgcgca ccgttttctg ctcgtgagtc agaccttttc 120 

atttcctgca atatcctgtt tctttcccga ccccacagac ggttagacgg atatatgctg 180 

gtaaagtttg tcctcttcat gctgtgcttt ctgatcgagc tgctgatgct gtgtgcgtac 240 

ggtgaggata ttgtggaatc gccttggggt gattgatgcc gcttacggtt gcgaatggta 300

ccgggaaggg tcggtggcgt tccatcgatc cgtgctgcaa attatacacc gcagccagca 360

gtccgtcata ctgaccgcat ggaaaatttg gcccatccaa atgagtactt tcagtcagat 420 

cctgcaagct tcctggtcct actttaccct cctgaagacc gtctacggga ataa 474 



<210> 18 
<211> 157 
<212> PRT 

<213> Anopheles gambiae 
<400> 18 

Leu Cys Leu Pro Asp Val Ala Ile Ala His Val Leu Phe Arg Ile Arg
1 5 10 15

Gln Cys Thr Leu Asp Gly Gly Gly Asp Asp Val Cys Cys Ala Pro Phe
20 25 30

Ser Ala Arg Glu Ser Asp Leu Phe Ile Ser Cys Asn Ile Leu Phe Leu
35 40 45

Ser Arg Pro His Arg Arg Leu Asp Gly Tyr Met Leu Val Lys Phe Val
50 55 60

Leu Phe Met Leu Cys Phe Leu Ile Glu Leu Leu Met Leu Cys Ala Tyr
65 70 75 80

Gly Glu Asp Ile Val Glu Ser Pro Trp Gly Asp Glx Cys Arg Leu Arg
85 90 95

Leu Arg Met Val Pro Gly Arg Val Gly Gly Val Pro Ser Ile Arg Ala
100 105 110

Ala Asn Tyr Thr Pro Gln Pro Ala Val Arg His Thr Asp Arg Met Glu
115 120 125

Asn Leu Ala His Pro Asn Glu Tyr Phe Gln Ser Asp Pro Ala Ser Phe
130 135 140

Leu Val Leu Leu Tyr Pro Pro Glu Asp Arg Leu Arg Glu
145 150 155



<210> 19 
<211> 1206 
<212> DNA 

<213> Anopheles gambiae
<400> 19 

atggtgctga tccagttctt cgccatcctc ggcaacctgg cgacgaacgc ggacgacgtg 60
aacgagctga ccgccaacac gatcacgacc ctgttcttca cgcactcggt caccaagttc 120
atctactttg cggtcaactc ggagaacttc taccggacgc tcgccatctg gaaccagacc 180
aacacgcacc cgctgtttgc cgaatcggac gcccggtacc attcgattgc gctcgccaag 240
atgcggaagc tgctggtgct ggtgatggcc accaccgtcc tgtcggttgt cgcctgggtt 300
acgataacat ttttcggcga gagcgtcaag actgtgctcg ataaggcaac caacgagacg 360 
tacacggtgg atataccccg gctgcccatc aagtcctggt atccgtggaa tgcaatgagc 420
ggaccggcgt acattttctc tttcatctac caggtacgtt ggcggaatgg tattatgcga 480 
tcgttgatgg agctttcggc ctcgctggac acctaccggc ccaactcttc gcaactgttc 540 
cgagcaattt cagccggttc caaatcggag ctgatcatca acgaagaaaa ggatccggac 600 
gttaaggact ttgatctgag cggcatctac agctcgaagg cggactgggg cgcccagttc 660 
cgtgcgccgt cgacgctgca aacgttcgac gagaatggca ggaacggaaa tccgaacggg 720 
cttacccgga agcaggaaat gatggtgcgc agcgccatca agtactgggt cgagcggcac 780 
aagcacgttg tacgtctcgt ttcagcaatc ggagatacgt acggtcctgc cctgctgcta 840 
cacatgctga cctccaccat caagctgacg ctgctcgcct accaggcaac gaaaatcgac 900 
ggtgtcaacg tgtacggatt gaccgtaatc ggatatttgt gctacgcgtt ggctcaggtt 960 
ttcctgtttt gcatctttgg caatcggctc atcgaggaga gctcatccgt gatgaaggcg 1020
gcctattcct gccactggta cgacgggtcc gaggaggcaa aaaccttcgt ccagatcgtt 1080
tgtcagcagt gccagaaggc gatgactatt tccggagcca agtttttcac cgtttcgctc 1140
gatctgtttg cttcggttct tggagccgtt gtcacctact tcatggtgct ggtgcagctg 1200
aagtaa 1206



<210> 20 
<211> 401 
<212> PRT 

<213> Anopheles gambiae 



<400> 20 

Met Val Leu Ile Gln Phe Phe Ala Ile Leu Gly Asn Leu Ala Thr Asn
1 5 10 15

Ala Asp Asp Val Asn Glu Leu Thr Ala Asn Thr Ile Thr Thr Leu Phe
20 25 30

Phe Thr His Ser Val Thr Lys Phe Ile Tyr Phe Ala Val Asn Ser Glu
35 40 45

Asn Phe Tyr Arg Thr Leu Ala Ile Trp Asn Gln Thr Asn Thr His Pro
50 55 60

Leu Phe Ala Glu Ser Asp Ala Arg Tyr His Ser Ile Ala Leu Ala Lys
65 70 75 80

Met Arg Lys Leu Leu Val Leu Val Met Ala Thr Thr Val Leu Ser Val
85 90 95

Val Ala Trp Val Thr Ile Thr Phe Phe Gly Glu Ser Val Lys Thr Val
100 105 110

Leu Asp Lys Ala Thr Asn Glu Thr Tyr Thr Val Asp Ile Pro Arg Leu
115 120 125

Pro Ile Lys Ser Trp Tyr Pro Trp Asn Ala Met Ser Gly Pro Ala Tyr
130 135 140

Ile Phe Ser Phe Ile Tyr Gln Val Arg Trp Arg Asn Gly Ile Met Arg
145 150 155 160

Ser Leu Met Glu Leu Ser Ala Ser Leu Asp Thr Tyr Arg Pro Asn Ser
165 170 175

Ser Gln Leu Phe Arg Ala Ile Ser Ala Gly Ser Lys Ser Glu Leu Ile
180 185 190

Ile Asn Glu Glu Lys Asp Pro Asp Val Lys Asp Phe Asp Leu Ser Gly
195 200 205

Ile Tyr Ser Ser Lys Ala Asp Trp Gly Ala Gln Phe Arg Ala Pro Ser
210 215 220

Thr Leu Gln Thr Phe Asp Glu Asn Gly Arg Asn Gly Asn Pro Asn Gly
225 230 235 240

Leu Thr Arg Lys Gln Glu Met Met Val Arg Ser Ala Ile Lys Tyr Trp
245 250 255

Val Glu Arg His Lys His Val Val Arg Leu Val Ser Ala Ile Gly Asp
260 265 270

Thr Tyr Gly Pro Ala Leu Leu Leu His Met Leu Thr Ser Thr Ile Lys
275 280 285

Leu Thr Leu Leu Ala Tyr Gln Ala Thr Lys Ile Asp Gly Val Asn Val
290 295 300

Tyr Gly Leu Thr Val Ile Gly Tyr Leu Cys Tyr Ala Leu Ala Gln Val
305 310 315 320

Phe Leu Phe Cys Ile Phe Gly Asn Arg Leu Ile Glu Glu Ser Ser Ser
325 330 335

Val Met Lys Ala Ala Tyr Ser Cys His Trp Tyr Asp Gly Ser Glu Glu
340 345 350

Ala Lys Thr Phe Val Gln Ile Val Cys Gln Gln Cys Gln Lys Ala Met
355 360 365

Thr Ile Ser Gly Ala Lys Phe Phe Thr Val Ser Leu Asp Leu Phe Ala
370 375 380

Ser Val Leu Gly Ala Val Val Thr Tyr Phe Met Val Leu Val Gln Leu
385 390 395 400

Lys



<210> 21 
<211> 2272 
<212> DNA 

<213> Anopheles gambiae



<400> 21 

tctagacttg aacccatgac gggcatttta ttgagtcgtt cgagttgacg actgtaccac 60
gggaccaccc gtttatcact atcactatta attaattata atatgctttt gtagcgatca 120
gcctaccggg ttttgtttct ctggatatct taagttccca tttgattatc aagatagaac 180
aacaacttgt accttaaata atcattacgt acccttaatc aacctgtgca tcaaggagtt 240
ttcgcgaaag caaaaatccg attgtctgat gttgtcttga ttccatccga ttcgttactg 300
gttctgcaaa atcgtccaat aatacggcaa tgtccttatc gatgcttgaa tcaacatcac 360
attgtttgca tttcgttttt tgcgtgcaaa tatgttattt gcaaagaagg caaggtaatg 420
tgcttaagag taaatacaat tcgctgtcca ttttttgtcc accagtgtgc cagaacccgt 480
gccttttagt ccttcgaata catccgacca gtcagcaagc aagtgcatca tggtgctacc 540
gaagctgtcc gaaccgtacg ccgtgatgcc gcttctacta cgcctgcagc gtttcgttgg 600
gctgtggggt gaacgacgct atcgctacaa gttccggttg gcatttttaa gcttctgtct 660
gctagtagtt attccgaagg ttgccttcgg ctatccagat ttagagacaa tggttcgcgg 720
aacagctgag ctgattttcg aatggaacgt actgtttggg atgttgctgt tttctctcaa 780
gctagacgac tatgatgatc tggtgtaccg gtacaaggac atatcaaaga ttggtgcgtg 840
ataatgattg ataaaaggaa cctttgagca actcctatcc ctttcaagct ttccgtaagg 900
acgttccctc gcagatgggc gactatctgg tacgcatcaa tcatcgtatc gatcggtttt 960
ccaagatcta ctgctgcagc catctgtgtt tggccatctt ctactgggtg gctccttcgt 1020
ccagcaccta cctagcgtac ctgggggcac gaaacagatc cgtcccggtc gaacatgtgc 1080
tacacctgga ggaggagctg tactggtttc acacccgcgt ctcgctggta gattactcca 1140
tattcaccgc catcatgctg cctacaatct ttatgctagc gtacttcggt ggactaaagc 1200
tgctaaccat cttcagcaac gtgaagtact gttcggcaat gctcaggctt gtggcgatga 1260
gaatccagtt catggaccgg ctggacgagc gcgaagcgga aaaggaactg atcgaaatca 1320
tcgtcatgca tcagaaggcg ctaaagtaag gtctgccggt atgttgtgga tagaatacat 1380
ttctagctgc tttcagatgt gtggagctgt tggaaatcat ctttcggtgg gtttttctgg 1440
gacagttcat acagtgcgta atgatctggt gcagcttggt tctgtacgtc gccgttacgg 1500
taactaaaag cactgtagtg atctgtctgc cacaccattc actgctgtgt cttgttttgt 1560
cactcttccc agggtctcag cacaaaagcg gcaaacgtgg gtgtactgtt tatactgcta 1620
acagtggaaa cctacggatt ctgctacttt ggcagtgatc ttacctcgga ggcaagttgt 1680
tattcgctga gtttcagtta cttttccgtt cccctctaac cgtaccactt gtaccatttg 1740
tttgagacag agcttgagcg tagcacgtgc tgcgtacggt agcctctggt atcgccgttc 1800
ggtttcgatt caacggaagc ttcgaatggt actgcagcgt gcccagaaac cggtcggcat 1860
ctcggctggg aagttttgct tcgtcgacat tgagcagttt ggcaatgtat ggggagacct 1920
tccactgtgg caagaaagat tttctttatt aatgcatctt ttaatttaca gatggcaaaa 1980
acatcatact cgttctacat cgttctgaag gatcaatttt aaaggggaac tcccccaccc 2040
gaccagacga cggaaagcta acgatgtgca attgaatagt cattagtagc gtttttgctc 2100
gcaaacgaac taaccctttg actttttaag ttcactacgg tgaggacaaa aatcaataaa 2160
ttaaatcgag accgttgatg agcaaaagaa aaaaaaatat tttactgatt ttcatttcgt 2220
tccatcgact acataatcat aattatatgc cacattttat tataagtttt tg 2272



<210> 22 
<211> 931 
<212> DNA 

<213> Anopheles gambiae 



<400> 22 

aacacccatc ttatcggcaa aattagtatt taccgtttga aagcggcttc ccttcctggc 60
tgtttctcac tctctctctc tctgtctctc ttattgatgc cgtatgcgcc gcgtgctata 120
ggctagttat gcttaccgga tgttgcgatc gcgcacgtgc ttttccgcat acgccagtgc 180
acacttgatg gcggtggtga tgacgtctgc tgcgcaccgt tttctgctcg tgagtcagac 240
cttttcattt cctgcaatat cctgtttctt tcccgacccc acagacggtt agacggatat 300
atgctggtaa agtttgtcct cttcatgctg tgctttctga tcgagctgct gatgctgtgt 360
gcgtacggtg aggatattgt ggaatcggta aggcaccagg cggtgatgag cgagtcgcga 420
gtaattgaag cttttgcttt taaaacacat cagagccttg gggtgattga tgccgcttac 480
ggttgcgaat ggtaccggga agggtcggtg gcgttccatc gatccgtgct gcaaattata 540
caccgcagcc agcagtccgt catactgacc gcatggaaaa tttggcccat ccaaatgagt 600
actttcagtc aggtgagttg ccaattgatt gccgtttgcg ttaatatttc agtaagagtg 660
cgctctttcc cttagatcct gcaagcttcc tggtcctact ttaccctcct gaagaccgtc 720
tacgggaata agtaagcgcg agagagagag agagagcagt atcgttcacc ctttggatga 780
atcaatagat ttctaatcat gaaccattga aaaatgaatc aacattttcg ctagttgcac 840
aatattgtac cattctatac agcttcacca cgaccaagcg tttgttgcat caggaccaaa 900
cacgtttcga caagccgcgt cacctgctgg c 931



<210> 23 
<211> 11103 
<212> DNA 

<213> Anopheles gambiae 



<400> 23 

ccgcccgggc aggtgactta cgcggtctga cttgctggtg cgctgctttg tacggcaaac 60 
ggctacacaa gcgaatcgaa ttattttcct atcacgctgc gcttaccagc gcctgctggt 120 
aggcaaagaa tgtgcaaagt ttcatttggc ttggttcgtc tgctttgctg tgaacgtgtg 180 
cacggttgca tcgctaaggt ttcggtgtga gccgagaagt tgcagatcga aatctctttg 240 
tgtgtgtgtg tgtgtgtgca gtgggaagca ttgtgtttag tgagaagtga aaagaaaagt 300
gctgaaaaat gcaagtccag ccgaccaagt acgtcggcct tcgttgccga cctgatgccg 360
aacattcggg ttgatgcagg ccagcggtca actttctgtt ccggctacgt caccggcccg 420 
atactgatcc gcaaggtgta ctcctggtgg acgctcgccc atggtgctga tccagttctt 480 
cgccatcctc ggcaacctgg cgacgaacgc ggacgacgtg aacgagctga ccgccaacac 540 
gatcacgacc ctgttcttca cgcactcggt caccaagttc atctactttg cggtcaactc 600 
ggagaacttc taccggacgc tcgccatctg gaaccagacc aacacgcacc cgctgtttgc 660 
cgaatcggac gcccggtacc attcgattgc gctcgccaag atgcggaagc tgctggtgct 720 
ggtgatggcc accaccgtcc tgtcggttgt cggtatgtgt gtatgtgtgt ggccgtttgg 780 
gaaagtgtct ttgcggcaga accccaatct actgttacgc ttgactgggt ttttgttttt 840 
ttctcggtgg agggacggga taaaatatct gaaagaataa ttgagtcaac ccacaggggg 900
atgcaagaca tcgcaggcag agagtttggg tttgatttat caccgcacac cgaatatctt 960 
cacggttcat aagcttcacc gcggtgaaaa gggaactccc catttccctg ttttcttttt 1020 
tttcttcctc tcgataaatt actcatcgct tttcgttttt ttttttttgt tgttgcttct 1080 
ttcttctttc atccctacta gcctgggtta cgataacatt tttcggcgag agcgtcaaga 1140 
ctgtgctcga taaggcaacc aacgagacgt acacggtgga tataccccgg ctgcccatca 1200 
agtcctggta tccgtggaat gcaatgagcg gaccggcgta cattttctct ttcatctacc 1260 
aggtacgttg gcggaatgtc ctgcgcgtca cagttggcag tcagtgagcg gcaacacggc 1320
gaaaaaatgg gactaaaacc ggtcttcaca gagccaacac attcctacag caattgcata 1380
ccttcgggcg gtcgggactg ggcaatgcag ctacaacatc ctcgcctaaa gttatgcaat 1440
tcgagcgaca aatgttgccg tgttagggct ttttgtgata atagtcgttt ttttgtcctc 1500
tcgcttatca aactctatca acggaggaaa tccattttcg ctacaatgcc tacagctcaa 1560
gtttcaaggt caatcgagcg ggtggggatc aactttttta ttcattttgc taacgcccca 1620
tcaacaaatt ctatgttctc aatggcaaag attactgccc gcaccaatcg cccaacgaaa 1680
cggcaaaaga aaagcgacga ttatgaagat gtccaaacca ttgcccgccc gacgctttat 1740
ctgatgattt gcgggatggc ttttacttgt ctgctacttt caggcacaaa aggaaatgaa 1800
accagcgcag gctcgtttgc cggcttgcgg aggttcttca ggcactgagg ctgagtactt 1860
aaatcgaacg atttttacga ttctggatcc agttttatga tgtggcctgc attacagtgg 1920
caattatacc ctgatgttca tttcattgca ttttgtaagt ttgtgctggt aacgcccgta 1980
acgattaatt cttttcaaag agattctttc aaagagattc aaaatgtgta taacaaatgc 2040
taacgaatgg accgtacttg gagggttgcg gaaagtaacg ttttaaaata ttcatcacaa 2100
tcctctgcaa acttgtgctt aattaattgg tgcacaataa gtttaaactg tggcggcaga 2160
tgtgtcgctg tccgcttcct tccttcccag caagctcgtg cgaaataatt tattccatca 2220
ttttaataca gccgtttgtg cattttaatt agcaaagcaa tataaaaagc agctaaccat 2280
ccccattaaa acaaagtgct tccgggccca attgttatgg cggtggaaag taatggtttt 2340
accagtggaa gtgtcctttc ccatcgtggg tacttcgcga tattcttgtc ttatacaagt 2400
gcatacagaa aaaaaggaca aatcctcctt gctatggtct aaggccagct tcggtaccgc 2460
ttccgcttcg ggatgtcata aagtttgatg ggtgttttta acattacttc cgctcttaac 2520
cacctaatgg acttttcatg cttgagctaa agttaaacca gccaccagcg gtacgcaccg 2580
agccacggtt gatttcggcg gcggcctcat ccccagtttt gcgccaccaa tattgccttc 2640
attaatctgt accctcggag cgttagggcc cgcggacgag tcctcgttgt aatgcaccgc 2700
catgccacgg gacgggataa tccgttggga cggcgcgaaa gcgactatcg cggacggatt 2760
ggttcgaccg tgctacaaca cattttatgc ttcacagatt tacttcctgc tgttttcgat 2820
ggtccagagc aacctcgcgg atgtcatgtt ctgctcctgg ttgctgctag cctgcgagca 2880
gctgcaacac ttgaaggtag gtacggtagc aaacgtggtt gtctttacat ccgcgtgcag 2940
cattatcctt atcgacgtgt agtgttaacg gtaaaagagg aagcgataaa aaagcaacat 3000
tctctcacac cctcgatctc tctttatttt ctctctctct ctctctctct ctctctctct 3060
ctctctctct ctctctctct ctctccatct cctcgggcag ggtattatgc gatcgttgat 3120
ggagctttcg gcctcgctgg acacctaccg gcccaactct tcgcaactgt tccgagcaat 3180
ttcagccggt tccaaatcgg agctgatcat caacgaaggt atgtgaaacg tgtgctcgtg 3240
gcagacggac tcaaagagag cataacacaa tcccctggta gttcatttca atgaccttaa 3300
cactcggcaa gctaagcgag acagtgggga cagtgagaaa gagagaacaa gaaaaaaaac 3360
catcatccgt acgacatcat cgctacgtac cggtatttca ggatgaggaa ataaaacgct 3420
aggggaatga aagtgcgaca gaatgataaa acaatcccca cccaggcccc cagcctggac 3480
gaacggatgt agtgtgcgaa gcgagcaaaa aaagtcaaat aaattgaagt ttaaaaatag 3540
attttccccg tccatccgtg gtggagcgta aagcccggcg gacaacttcg agcacggcga 3600
ccgtgcacag tactgtgcca cagttgtagg gacggataag ctccgttcct tttttatcct 3660
ttttttttgg agatttgttt gcgttcgcat cgttagacga gcttagtgcc gtgttgctct 3720
aattgctatt tattataaag cgcttccaaa tagaagatcg gttctctcca tttaatctat 3780
cgcgcctgta cgcctgaaac tatgcactgt gctgtgaaac cgtcaagctc gagcacgacg 3840
aatggcccac cgtaccacgc ccgtggtgcc caaagcgcaa cgcgaattgc atgttaacaa 3900
acctttgcct accatccaat ccgtgtgaaa ttgcccgctc tctttctctc ttttgcgctt 3960
tcggtgtatc gaacggtttt gtcccttttt tttactttgc tcttgatctc ttgctgtgct 4020
cactttcatc tcatgttttg cctgacggtg gtgggttttc gaaaaaagag cgatttcttc 4080
tgcgtgtgtg tgtggttttt ttaaataacc gctccaggtc gtgttgaacg ctgcaggacc 4140
gatcggagct agtttattat cagctttagt gtttatccca cccatgcccc acatcacgtc 4200
tgtggagagt gggggaagct taagtccaat gtaatttacc gtgtttctgt cgttcgtcac 4260
cttcttcgtc gatggagatt ggtgcggttg gcacgataaa agcccactgc acgttacgga 4320
ccgagggaaa ggtctttttg taggcctagc aacggtcctc attcaccgca tgggggtgta 4380
gctcagatgg tagagcgctc gcttagcatg tgagaggtac cgggatcgat acccggcatc 4440
tccaacccac acaaaacgtt ttttaagaag atttttaggg aagatattaa cgcgggtaca 4500
ctgtgctcct ctaagttgga agagtagatg agatgatgac aagggagaag gaacatgtgt 4560
acgtgtttga tagcaaacac acaaacaaca atatcatctc tgataataat ctgatgtgtg 4620
atgtgtgtgt attgttgtta tgctgccttt gccatcttgt ccctctctct cctgttcaac 4680
tcctaaaaga attgtttgga gtcctctcag ttcctcgtaa agatcctttc gagattcttc 4740
tttccttttt attatttatt ccacgagcct ctgacataag tagccttccg cttatttcct 4800
tctccttgca cttgtcagtt ccgtgtagag cgtcattttg aggtttacac atttcccacc 4860
gacgcctgat tgttacattg tcatctacat tgctttccgt ttaccgttcc gccctttttt 4920
tttaacgcta ccacagaaaa ggatccggac gttaaggact ttgatctgag cggcatctac 4980
agctcgaagg cggactgggg cgcccagttc cgtgcgccgt cgacgctgca aacgttcgac 5040
gagaatggca ggaacggaaa tccgaacggg cttacccgga agcaggaaat gatggtgcgc 5100
agcgccatca agtactgggt cgagcggcac aagcacgttg tacggtaggt atggtaattt 5160
ctaaggtgtg gtgtaaagcc tccaggttcc atgaaaaagg gatactttac cacagtaaga 5220
gtttgttttg ctggacttac attctttgga gcattgtttg gtgttgtgct gaaaccggtt 5280
gcaatatcgt tttgcgaaga aattatgtgt aaagcgtatt acaatctcat tcctctgtta 5340
atctgtacca attgtgtcag ccccgaccga aagcaggcct aattcgtacc agaaaaacca 5400
caagctgttt gtaagcatcg atacgcccga agctttcaat ccagccaagg cgccacctac 5460 
tattgacgtg actttttgca cgttcacact ctccctctcc cattctttct ataaccaatc 5520 
gtcgctcagc cagcatcgcc cggagtgaag tttttatttg aacgatatca cccgtatcga 5580
ttttccacta aacatgctta aatcgtttca caaagctccc ccaaaatccc atttcaccaa 5640 
tccaccaatt tgaagtccgt cgtcctttgt gtccttgtgt ttgtgtgttt gtgtgagctg 5700 
gagacatggg ggagtgagta accgaacaac ctcttgccgc tgcttcacga tatcgaacag 5760
caccaagata agcatccctt tttccctagc cgatgtctcc gatatctcga ttccgcttcc 5820
agcgaggcaa agaaaaaggc gaactggctg acctcacccg gggcgaggaa aaagcgtagg 5880
gattacgtcg agcagcacga gttgtgattt cttcttcttc tggttccata aatcgctgac 5940
ggtttccatt accgcctgcg gagtgcacac acgtgaaggg aaagcgaaaa cgtttagatt 6000 
ccagcagcaa cggcagcacc agaagcagca gcagcgcggc aaattgaatc atcctgacgc 6060
gatgagttgt ctgggttttc gggtcggtgg cttacagcac cacaccatct gctgcagcta 6120
atacagctgt aaatttcgtt agacatagac ttgattttac aatattacac acacacttac 6180 
acacacagct atagatttgt cgcttggcgt atggctctgt acggcgtgcc gtacatgccg 6240
cgagccgtgt tgctgctggt tgcgatacgg atcacgtccg attcgattca gcctgcgtgt 6300
ttttggtgaa gatccttatc ggtgacccac tttcagtgtg tcgagagcga gggtcactat 6360
ggcgcctgtc agttggaaag ctaggctcga ttcaaagggc cattgtgcca gtgttctttt 6420
taagatagcg ataagctttt gatcgaaata gtaaatcaaa cattgtttct tttttcctat 6480
tccaaactgt tgccaacctc attattacgt ttttgcagcg ggtgtatagt aaattgcata 6540
ctttaaggcg tgattttcaa atgtagcgtt ccgtatgcag aaacgccatg gattatgcaa 6600 
tttaaacaat gctgcttcct taacattcaa ataacggctt attaaggaac tttttgtgca 6660 
atttgttttt aacagcaaat agttagctca gaacgatcac atttagtatc gcttcaacaa 6720
agaactcttt taaacacaca atttgtaatg ccattccctc gagaaagttt cttgtcagtc 6780
ctcctctgca tcacagcaac aaccaaacct gctcatgttt cctgctcgtt tcctagctgt 6840 
tttgaacgtt atttccgatt cctgtgcttg cccgcttttc ttacaatcaa ccacaatggt 6900 
tcagatttcg ctcttatttt attgacccac tgctttcgtg ctgaagcccg tggaaacaat 6960 
gcgccaagct cagcatccag ccatgcatgt aaaatgagcc acgcgacaga ttttagacat 7020
cgctttcgct ctgcaccgga ggtggtttta ttcttgtttc cgattcccac gtccattcgt 7080 
cctgggtccg tccgccgggc ccgaaaccgt aagccgtgcg gggaattacg caatcgaaac 7140 
gagccagaaa atgagcacgc caaatgcaaa gaaaatcccc ttttgagtgg tgctcctgcc 7200 
accactcatc tccccaactg gtgggtgaaa aaccttgtgc gccccttctc tttccagaaa 7260 
aaaaacgcct cgctcgcaca aaaacatgct cgcccggtga agctgcgtat gtcgcagaag 7320
ctcaaaccaa cgccgccagc aagcatcaac aatttctatt caaacaccca acgcagcgcc 7380
caaaccgggt gcactgtact cagtagcgaa gatgctcaga ttgtcccgtg cgctgctttc 7440 
gatgcccgtt tcggagcggg aagccatcgc ttgccaacgt tggcgatgtc ttttagccgt 7500
ggatttgaat tttctgaata tcacaggcgg gcgcggtttg cctgcaaggt tgttgcttcc 7560 
cacacgagca ttgctttccg taccgcggtg gggcgagttt tcaacgcaac cttctacaag 7620
caacgccaca acgcctggga gcgatattta acagaaacaa gaacatcccg aacttcagca 7680 
catgccgtga tttgcctgtt ggaaaagctt ttgtgagcgt gtgagttgaa cgagctctat 7740 
tttcccagcg atgggtggca tttgtgtggc atgctatcgt cagcttttct tgaatcttta 7800 
cctctccatt cgcctccatt agtacacgcg tatggaaaat gggtgcaacg gatcagaacg 7860 
gattttccgc gacagactta ataaagggaa agcaacgcgt tttttgcatg tgtagtgttt 7920
atgagcttta tgccgbtact ttgcaattaa aaatagcaaa aaataacagt ttttttttgt 7980 
aagcggatta caaagaatgt atcagaatat tacgtgaaac attcatttca tgctgttaac 8040 
gctcaaatag aatagttttg taacacggat tgcatacctt gccggtatcg gttacatttt 8100
cgcctaacag tatgcaatct gtttagcttt gttgtttaat gactgcgttg gtagtacaat 8160 
atttatttac accgcgtaat ttatctcaca aattgcaaaa aaatgtcaat ctgtatcgat 8220 
tattcacaca aatcagatcc cggaaccagt gtagcccaat gtgctcttat tgaattacca 8280 
cgaacaaatc aacctgatgc ccgggtccgt tggcaaacag cttgcgccga agccgctcag 8340
tgtttcgtgc actaccgtgc tgccattttg ctgccctcat cgaacagata aacagaaggg 8400
caactcttgt gagcatcgca atgcccgtct gaagttccgt cgaaaatggg cctaaattca 8460
atttgacgca tttacccgcg aacaattgcg cgaaggctgt caagtgtgtt ccacgaactg 8520
cgacaacaag cacacacaca aacacaaatg ttatcgtttc ggcatgtttc tcggtacaaa 8580
gcgtgtggcg ctatgtggca tgccgattcc cagacagagt gatcgatagt aaatgtagcc 8640
tatccggtag cattcaattt ccttttctat cctcgcaaac aaagcccatt ctggggaggc 8700
gtggtgaagc tttcaaaggc attgtgaaac aaatgtcctg gttcggaggg atgctgggga 8760
aagcaaacac ggtgccgcca tcgctgctac cgtcaatcga tcafcgcatga tgtgattaat 8820
atttgtgtta ttcacctgcg tatctatgcg tccgtcgtgt cgttcggatt tccggaagtc 8880
aaggaaaaag cgactccatt tgggattggt ttttgcagcg aaaaatcaaa acattcgcac 8940
aaaaccgtcc tccatttcaa atgcctacac ttgtcactgt atatctctct ttctctcgtt 9000
ttgccacgtt gcagtctcgt ttcagcaatc ggagatacgt acggtcctgc cctgctgcta 9060
cacatgctga cctccaccat caagctgacg ctgctcgcct accaggcaac gaaaatcgac 9120
ggtgtcaacg tgtacggatt gaccgtaatc ggatatttgt gctacgcgtt ggctcaggtt 9180
ttcctgtttt gcatctttgg caatcggctc atcgaggagg tacgtgcgct cggcgtgttg 9240
ccgtgggaaa gcattctccc tgccccatat cgcttcattc tcccagatca cacatttgca 9300
tcacaaagcc agcacacttt tgcttcgccg ctgccatctc ggcttctgaa tgttttcact 9360
tctcccatac ttctcccgtg cagagctcat ccgtgatgaa ggcggcctat tcctgccact 9420
ggtacgacgg gtccgaggag gcaaaaacct tcgtccagat cgtttgtcag cagtgccaga 9480
aggcgatgac tatttccgga gccaagtttt tcaccgtttc gctcgatctg tttgcttcgg 9540
taagtgtagc ctggtggctg gcacagaaca ggctggcaaa acagggactt tggctctagc 9600
ctgatgggtg gtatatgtgt gtctattttt tgctaccatt ctcgcatccc ttcctttcca 9660
ggttcttgga gccgttgtca cctacttcat ggtgctggtg cagctgaagt aaacagccgt 9720
ggcccggaag gatgtgtttt ttttcgctcg ttcggttgtt tgtttgtgca cactttctct 9780
tggacatttt ctctactgca aaggtttaac aaacagcaac aacaaataat cccaagtttt 9840
cttttacaga tctttgcaaa atgattagat tttaatagat taacagtgct tgattatctg 9900
tcctgtagca accggggctg aagaacgttg atttggtaaa agtacaaaag ggacgttgga 9960
aattgaacca ccagaagagt gatatttatg caaagctcac caagggaaat ctatgtatgt 10020
gtgatttgcg ctcatcaagc actgtatgtg cctttcaact agtgcagcaa taaagagtac 10080
aaatgtttct tagcgcaccg tacattgtcg tttcggcgtt ttaaccgttg ttgataatac 10140
acaaaagatg ataaaaataa ataataacaa aatgttaata tgagtaagta ctaaatagag 10200
aaatcgtttt agtatgatca tacctccaat catttgtttg aaattaactt taattttaac 10260
tcaaattaaa ccgatgtttt actttctgtg agaattattg tggaagaact taatggaagt 10320
ataattaaat tgattgctaa ctttatgcgt ttttcaattt acgaacgcta gtcttcaaac 10380
atcgcttcaa aagtattact accacattat tcatttactt atagttatat ttattgcctc 10440
ttcatctttc catggccaga actactgcag aaaagcttct tttttgctcg ctttccgatg 10500
gttggttgga cgaagttggt aacaaacggc aagcaattag cataaactat tttcgcatcg 10560
agatggaaat gaatgtacca ctagaaccga gtgaaatgaa ttacttttca acttgcacgc 10620
caaaaccatt atctaaagta cgcacaactt aaaaacaaac cccaaattgt cgtccaccct 10680
tcattccact ttcttgctac actttccgac cgagttctgt agcgccagca gcaaaaaaat 10740
acatataaaa ccttcatcac tcaagctgta tcgagccagc gtgggttgtg tttgactgtg 10800
ctgtgaaaga aagaagaaaa aaaaaacact tccacgggaa gctagcaatt ggaaatgcat 10860
aaattaaccg gaagaaattc gcaaaacccc gcaccgacgt accgcaccgc atccgtaccg 10920
ataccggaac aaacggtgtg cgcgaaagaa tccgctagca gccccactgg cacgggtatt 10980
tgcttttggt tctgtgtttt tcttccactg gtttgggtgc ctgggcgaag gctagctcgg 11040
ctactttccc ggggccgcaa ttttctgcag cccaaggcgg cgtgctcgtg gggccaaaag 11100
aat 11103